Computer Vision · Multimodal Learning

Chenyuan Qu

Tech Lead in AI/ML · PhD Student

Allsee · Vieunite · University of Birmingham

I work across research and production AI systems, combining doctoral work in computer vision and multimodal learning with applied machine-learning systems for search, recommendation, and generative workflows.

Research

Research Interests

My recent work spans interpretable image representations, multimodal scene understanding, diffusion-based methods, and dataset-building for visual learning.

Computer Vision

Learning interpretable representations and robust visual understanding from images, scenes, and multimodal observations.

Multimodal Learning

Studying how visual, spatial, audio, textual, and viewpoint-specific signals can be fused for richer scene understanding.

Generative AI

Exploring controllable generative systems that connect representation learning with editing, restoration, and creative workflows.

AI for Science

Applying machine learning methods in scientifically grounded settings where interpretability and structure matter.

Publications

Selected Publications

BMVC

Exploring Image Representation with Decoupled Classical Visual Descriptors

Chenyuan Qu, Hao Chen, Jianbo Jiao

VisualSplit learns image representations from decoupled edges, colour segmentation, and grey-level histograms, enabling descriptor-to-image reconstruction, controllable editing, and diffusion-based restoration.

Year: 2025

Venue: British Machine Vision Conference (BMVC)
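As a rough illustration of the kind of classical descriptors mentioned above, the sketch below computes a normalised grey-level histogram and a simple binary edge map from a small greyscale image. It is not the VisualSplit implementation: the function names, the bin count, the forward-difference gradient, and the threshold are all assumptions chosen for brevity, and colour segmentation is left out.

```python
from typing import List

Gray = List[List[int]]  # rows of 8-bit grey levels (0..255)


def grey_histogram(img: Gray, bins: int = 8) -> List[float]:
    """Normalised grey-level histogram with equal-width bins over 0..255."""
    counts = [0] * bins
    total = 0
    for row in img:
        for v in row:
            # Map the grey level to its bin; clamp to the last bin for safety.
            counts[min(v * bins // 256, bins - 1)] += 1
            total += 1
    return [c / total for c in counts] if total else [0.0] * bins


def edge_map(img: Gray, threshold: int = 64) -> List[List[int]]:
    """Binary edge map from an L1 forward-difference gradient (illustrative only)."""
    h = len(img)
    w = len(img[0]) if h else 0
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Differences to the right and lower neighbours, clamped at borders.
            gx = img[y][min(x + 1, w - 1)] - img[y][x]
            gy = img[min(y + 1, h - 1)][x] - img[y][x]
            out[y][x] = 1 if abs(gx) + abs(gy) >= threshold else 0
    return out


# Hypothetical 4x4 test image: dark left half, bright right half.
demo = [[0, 0, 255, 255] for _ in range(4)]
demo_hist = grey_histogram(demo)
demo_edges = edge_map(demo)
```

Keeping the two descriptors in separate functions mirrors the idea of treating each cue as its own input, so one can be edited or replaced without touching the other.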

CVPR

360+x: A Panoptic Multi-modal Scene Understanding Dataset

Hao Chen, Yuqi Hou, Chenyuan Qu, Irene Testini, Xiaohan Hong, Jianbo Jiao

A panoptic multimodal dataset that combines panoramic, frontal, and egocentric viewpoints with audio, location, and textual signals for richer scene understanding benchmarks.

Year: 2024

Venue: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Publications

Publications & Datasets

2025
  • VisualSplit
    BMVC 2025 · Spotlight

    Exploring Image Representation with Decoupled Classical Visual Descriptors

    Chenyuan Qu, Hao Chen, Jianbo Jiao

    British Machine Vision Conference (BMVC)

    VisualSplit learns image representations from decoupled edges, colour segmentation, and grey-level histograms, enabling descriptor-to-image reconstruction, controllable editing, and diffusion-based restoration.

  • DIFF
    ICASSP 2025 · Spotlight

    Diffusion Features to Bridge Domain Gap for Semantic Segmentation

    Yuxiang Ji, Boyong He, Chenyuan Qu, Zhuoyue Tan, Chuan Qin, Liaoni Wu

    IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

    DIFF leverages diffusion-model features to improve cross-domain semantic segmentation by extracting and fusing semantically rich representations across the diffusion process.

2024
  • 360+x
    CVPR 2024 · Spotlight

    360+x: A Panoptic Multi-modal Scene Understanding Dataset

    Hao Chen, Yuqi Hou, Chenyuan Qu, Irene Testini, Xiaohan Hong, Jianbo Jiao

    IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

    A panoptic multimodal dataset that combines panoramic, frontal, and egocentric viewpoints with audio, location, and textual signals for richer scene understanding benchmarks.

2023
  • MeD
    ICCV 2023 · Spotlight

    Multi-view Self-supervised Disentanglement for General Image Denoising

    Hao Chen, Chenyuan Qu, Yu Zhang, Chen Chen, Jianbo Jiao

    IEEE/CVF International Conference on Computer Vision (ICCV)

    A self-supervised denoising framework that disentangles clean image structure from corruption by comparing multiple noisy views of the same latent scene.

Datasets

BinEgo-360

2025

A multimodal dataset and challenge for scene understanding that pairs binocular egocentric and 360° panoramic views with aligned spatial audio, text, and geo-metadata.

text-to-art-database

2026

A privacy-safe text-to-image dataset released on Hugging Face, repacked into Parquet shards with embedded image bytes and organised into samples and iterations splits.
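For readers who want to work with rows from such shards, here is a minimal sketch of pulling the embedded image bytes out of one record and checking their format. It assumes the image cell follows the `{"bytes": ..., "path": ...}` layout that the Hugging Face `datasets` library uses for embedded images; the column names `image` and `prompt` and both helper functions are hypothetical and not taken from the dataset itself.

```python
from typing import Any, Dict, Optional

# Leading "magic" bytes of common image formats.
_MAGIC = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def sniff_image_format(data: bytes) -> Optional[str]:
    """Guess the image format from its first bytes, or return None if unknown."""
    for magic, name in _MAGIC.items():
        if data.startswith(magic):
            return name
    # WebP is a RIFF container with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    return None


def extract_image(record: Dict[str, Any], column: str = "image") -> bytes:
    """Return the raw image bytes stored in one row (column name is assumed)."""
    cell = record[column]
    data = cell.get("bytes") if isinstance(cell, dict) else cell
    if not isinstance(data, (bytes, bytearray)):
        raise ValueError(f"column {column!r} holds no embedded image bytes")
    return bytes(data)


# Hypothetical row: only the PNG signature is present, not a full image.
row = {"prompt": "a red kite", "image": {"bytes": b"\x89PNG\r\n\x1a\n", "path": None}}
row_format = sniff_image_format(extract_image(row))
```

Because the bytes travel with each row, a record can be inspected or decoded without resolving any external file path.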

News

Recent News

2025

VisualSplit accepted to BMVC 2025

Project page, paper, supplementary material, code, and model weights are publicly available.

2025

DIFF presented at ICASSP 2025

Collaborative work on diffusion features for cross-domain semantic segmentation.

2024

360+x published at CVPR 2024

Dataset paper with accompanying benchmark resources, code, and public dataset access.

2023

Started my PhD in the MI X Group

I began doctoral research on computer vision and multimodal learning in the MI X Group at the University of Birmingham.

2023

MeD published at ICCV 2023

An early project on self-supervised image denoising using multi-view disentanglement.

Experience

Professional Experience

2023 — Present

PhD Student

University of Birmingham · MI X Group

Doctoral research in computer vision, multimodal learning, and generative AI within the School of Computer Science.

Aug 2022 — Present

Tech Lead in AI/ML

Allsee · Vieunite

Lead applied AI/ML work across production systems for product discovery, creative generation, and internal tooling.

  • Build and ship machine-learning systems for recommendation, search, and multimodal product understanding.
  • Develop generative workflows for creative and merchandising use cases across the Vieunite stack.
  • Work across model development and product delivery, connecting ML pipelines with full-stack engineering.
Feb 2023 — Present

Research Assistant

University of Birmingham

Research on interpretable hydrological modelling and machine-learning methods for scientific analysis.

Jul 2020 — Sep 2020

Algorithm Engineer Intern

AsiaInfo Software Co. Ltd

Developed and optimised machine-learning components for a visual customer-service anchor in a mobile deployment setting.

Education

Academic Training

2021 — 2022

Master's Study

University of Birmingham

Postgraduate study completed at Birmingham before beginning doctoral research.

2018 — 2021

BSc in Physics

University of Southampton

Undergraduate training in physics, which continues to shape how I think about machine learning and computer vision.

Contact

The easiest way to reach me is by email.

University of Birmingham

cxq134@student.bham.ac.uk