Hong Liu

Uncertainty-Aware Testing-Time Optimization for 3D Human Pose Estimation

In this paper, we propose an Uncertainty-Aware testing-time Optimization (UAO) framework for 3D human pose estimation. During training, the proposed GUMLP estimates 3D results and uncertainty values for each joint. For test-time optimization, our UAO framework freezes the pre-trained network parameters and optimizes a latent state initialized by the input 2D pose. To constrain the optimization direction in both 2D and 3D spaces, projection and uncertainty constraints are applied. Extensive experiments show that our approach achieves state-of-the-art performance on two popular datasets.

Jun 15, 2025

MiLNet: Multiplex Interactive Learning Network for RGB-T Semantic Segmentation

A novel module-free Multiplex Interactive Learning Network (MiLNet) for RGB-T semantic segmentation that integrates multi-model, multi-modal, and multi-level feature learning through asymmetric simulated learning and inverse hierarchical fusion strategies.

Mar 3, 2025

HYRE: Hybrid Regressor for 3D Human Pose and Shape Estimation

A novel Hybrid Regressor (HYRE) that combines parametric and non-parametric paradigms for 3D human pose and shape estimation, bridging the gap between physically plausible and pixel-aligned results through joint learning.

Dec 25, 2024

Audio–visual keyword transformer for unconstrained sentence‐level keyword spotting

An Audio–Visual Keyword Transformer (AVKT) network for keyword spotting in unconstrained video clips, using a transformer classifier with learnable CLS tokens and decision fusion to achieve high accuracy in both clean and noisy conditions.

Feb 1, 2024

Co-Evolution of Pose and Mesh for 3D Human Body Estimation from Video

This paper proposes the Pose and Mesh Co-Evolution network (PMCE), a new two-stage pose-to-mesh framework for recovering 3D human mesh from a monocular video. PMCE first estimates 3D human pose motion in terms of spatial and temporal domains, then performs image-guided pose and mesh interactions through our proposed AdaLN, which injects body shape information while preserving their spatial structure. Extensive experiments on popular datasets show that PMCE outperforms state-of-the-art methods in both per-frame accuracy and temporal consistency. We hope that our approach will spark further research in 3D human motion estimation considering both pose and shape consistency.

Oct 2, 2023

Interweaved Graph and Attention Network for 3D Human Pose Estimation

An Interweaved Graph and Attention Network (IGANet) for 3D human pose estimation that enables bidirectional communication between GCNs and attentions, capturing both global and local correlations in human skeleton representations.

Jun 4, 2023

Gator: Graph-Aware Transformer with Motion-Disentangled Regression for Human Mesh Recovery from a 2D Pose

A Graph-Aware Transformer (GATOR) framework for 3D human mesh recovery from 2D pose, combining a Graph-Aware Transformer encoder and a Motion-Disentangled Regression decoder to capture joint-joint, joint-vertex, and vertex-vertex relations.

Jun 4, 2023

PCLoss: Fashion landmark estimation with position constraint loss

In this paper, we design a Position Constraint Loss (PCLoss) for fashion landmark estimation, which incorporates position correlation into landmark estimation models. Specifically, the PCLoss adds a regularization term for each landmark to regularize their relative positions. Compared with other alternatives, our PCLoss effectively mitigates the outlier and duplicate detection problems without modifying existing CNN architectures. In addition, our skeleton-like optimization method further strengthens the position constraints between landmarks. The proposed method can be applied to both regression-based and heatmap-based methods, and it provides a novel perspective on position relation learning in key point estimation tasks. Extensive experimental results on three challenging datasets, DeepFashion, FLD and FashionAI, demonstrate that our method outperforms other state-of-the-art methods. The experiment on COCO 2017 shows the potential of PCLoss for other key point estimation tasks, which can be explored further in future work.

Oct 1, 2021

Bi-directional Exponential Angular Triplet Loss for RGB-Infrared Person Re-Identification

We propose a novel ranking loss function, named Bi-directional Exponential Angular Triplet Loss, to help learn an angularly separable common feature space by explicitly constraining the included angles between embedding vectors.

Dec 12, 2020

Position Constraint Loss For Fashion Landmark Estimation

A Position Constraint Loss (PCLoss) method for fashion landmark estimation that constrains erroneous landmark locations by utilizing position relationships, applicable to both regression and heatmap-based methods without modifying network structure.

May 4, 2020

Spatial Pyramid Based Graph Reasoning for Semantic Segmentation

Feb 24, 2020

Self-Refining Deep Symmetry Enhanced Network for Rain Removal

A Self-Refining Deep Symmetry Enhanced Network (DSEN) for rain removal that extracts rotation equivariant features and uses a self-refining mechanism to remove accumulated rain streaks in a coarse-to-fine manner.

Sep 22, 2019

Expectation Maximization Attention Networks for Semantic Segmentation

We formulate the attention mechanism in an expectation-maximization manner and iteratively estimate a much more compact set of bases upon which the attention maps are computed.

Jul 22, 2019

Multi-label classification of PCB defects based on convolutional neural network

A multi-label classification method based on Convolutional Neural Network for PCB defect detection that can simultaneously identify multiple defect types, improving detection accuracy and efficiency.

Oct 22, 2018

Recurrent Squeeze-and-Excitation Net for Single Image Deraining

We propose a novel deep network architecture based on deep convolutional and recurrent neural networks for single image deraining.

Jul 19, 2018