publications

Selected research on uncertainty-aware vision-language-action models, reinforcement learning, human feedback, safety, multimodal alignment, and human-centered autonomy.

Latest profile: Google Scholar. * indicates equal contribution.

selected publications

  1. arXiv 2026

    Uncertainty Quantification for Flow-Based Vision-Language-Action Models

    Ralf Römer, Maximilian Seeliger, Saida Liu, Ben Sturgis, Marco Bagatella, Daniel Marta, Andreas Krause, and Angela P. Schoellig

    arXiv preprint arXiv:2606.18043, 2026

    Introduces Velocity-Field Disagreement (VFD) for epistemic uncertainty in flow-based VLAs and SAVE, an uncertainty-guided active fine-tuning framework that needs at least 22% fewer expert demonstrations than baselines.

  2. ICRA 2026

    MOSAIC: Multi-objective Optimization from Zero-Shot Language Reasoning in Preference-based RL

    Daniel Marta*, Simon Holk*, and Iolanda Leite

    In IEEE International Conference on Robotics and Automation (ICRA), 2026

    Reframes preference-based RL as multi-objective learning: language explanations are parsed into objective-specific labels, weights, and highlights for scalarized policy optimization.

  3. ICML + SPOT 2026

    Reinforcement Learning via Self-Distillation

    Jonas Hübotter, Frederike Lübeck*, Lejs Behric*, Anton Baumann*, Marco Bagatella, Daniel Marta, Ido Hakimi, Idan Shenfeld, Thomas Kleine Buening, Carlos Guestrin, and Andreas Krause

    In International Conference on Machine Learning (ICML), 2026; also accepted to the ICLR 2026 Workshop on Scaling Post-training for LLMs (SPOT)

    Introduces Self-Distillation Policy Optimization (SDPO), converting rich environment feedback into dense self-distillation signals for more sample-efficient RL with language models.

  4. ICRA 2024

    SEQUEL: Semi-Supervised Preference-based RL with Query Synthesis via Latent Interpolation

    Daniel Marta*, Simon Holk*, Christian Pek, Jana Tumova, and Iolanda Leite

    In IEEE International Conference on Robotics and Automation (ICRA), 2024

    Improves preference-learning sample efficiency by augmenting human feedback with synthesized preference queries from latent interpolation.

  5. ICRA 2024

    POLITE: Preferences Combined with Highlights in Reinforcement Learning

    Simon Holk, Daniel Marta, and Iolanda Leite

    In IEEE International Conference on Robotics and Automation (ICRA), 2024

    Combines preference feedback with temporal highlights to improve granularity and representation learning. Nominated for Best HRI Paper, Best Student Paper, and Best Conference Paper.

  6. HRI 2024

    PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning

    Simon Holk*, Daniel Marta*, and Iolanda Leite

    In ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2024

    Uses zero-shot language-model reasoning over optional textual descriptions to align learned rewards with human preferences.

  7. WACV 2024

    Human-Centric Autonomous Systems With LLMs for User Command Reasoning

    Yi Yang, Qingwen Zhang, Ci Li, Daniel Marta, Nazre Batool, and John Folkesson

    In WACV LLVM-AD Workshop, 2024

    Explores few-shot LLM reasoning for inferring autonomous-system requirements from in-cabin natural-language commands. Best Student Paper Award.

  8. IROS 2023

    VARIQuery: VAE Segment-based Active Learning for Query Selection in Preference-based Reinforcement Learning

    Daniel Marta*, Simon Holk*, Christian Pek, Jana Tumova, and Iolanda Leite

    In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2023

    Proposes a VAE-based active-learning strategy for diverse and informative preference-query selection.

  9. ICRA 2023

    Aligning Human Preferences with Baseline Objectives in Reinforcement Learning

    Daniel Marta, Simon Holk, Christian Pek, Jana Tumova, and Iolanda Leite

    In IEEE International Conference on Robotics and Automation (ICRA), 2023

    Narrows policy search with baseline objectives and requests human feedback when preferences matter most.

  10. RA-L 2021

    Human-feedback shield synthesis for perceived safety in deep reinforcement learning

    Daniel Marta, Christian Pek, Gaspar I. Melsion, Jana Tumova, and Iolanda Leite

    In IEEE Robotics and Automation Letters, 2021

    Learns shield parameters from human feedback to obtain policies perceived as safe.

awards & recognition