A high-performance computing framework for variable selection in GWAS that integrates state-of-the-art methods and employs novel optimization strategies for efficient processing of high-dimensional data with sparse characteristics.
Dec 3, 2024
A controllable and personalized UV Map generative model that fine-tunes a pre-trained text-to-image diffusion model with a face fusion module for ID-driven customized generation, addressing the challenges of personalized texture map generation and quality evaluation.
Oct 28, 2024
In this study, we have successfully integrated NGP into DIR, a novel contribution that significantly enhances the accuracy and efficiency of medical image alignment as demonstrated on the DIR-lab dataset. The NGPDIR framework exhibits robust performance across various metrics, particularly in landmark alignment precision and the accommodation of anatomical sliding boundaries. This advancement not only propels the DIR field forward but also opens new avenues for real-time clinical applications, potentially transforming patient care with its rapid, reliable imaging capabilities.
Jul 8, 2024
A neural clustering framework (FEC) for visual representation learning that views feature extraction as selecting representatives from data, automatically capturing data distribution and providing interpretable cluster assignments.
Jun 17, 2024
A Vision-Language Model Goes 4D (VG4D) framework that transfers VLM knowledge to 4D point cloud networks for improved video recognition, achieving state-of-the-art performance on action recognition datasets.
May 13, 2024
A novel one-shot federated learning method that performs layer-wise posterior aggregation, achieving superior performance by aggregating posterior distributions instead of point estimates.
Jan 1, 2024
In this work, we propose the Skeleton-in-Context, designed to process multiple skeleton-base tasks simultaneously after just one training time. Specifically, we build a novel skeleton-based in-context benchmark covering various tasks. In particular, we propose skeleton prompts composed of TGP and TUP, which solve the overfitting problem of skeleton sequence data trained under the training framework commonly applied in previous 2D and 3D in-context models. Besides, we demonstrate that our model can generalize to different datasets and new tasks, such as motion completion. We hope our research builds the first step in the exploration of in-context learning for skeleton-based sequences, which paves the way for further research in this area.
Dec 15, 2023
We propose Point-In-Context (PIC), the first framework adopting the in-context learning paradigm for 3D point cloud understanding. Specifically, we set up an extensive dataset of point cloud pairs with four fundamental tasks to achieve in-context ability. We propose effective designs that facilitate the training and solve the inherited information leakage problem. PIC shows its excellent learning capacity, achieves comparable results with single-task models, and outperforms multitask models on all four tasks. Besides, it shows good generalization ability to out-of-distribution samples and unseen tasks and has great potential via selecting higher-quality prompts. We hope it paves the way for further exploration of in-context learning in the 3D modalities.
Dec 10, 2023
This paper proposes the Pose and Mesh Co-Evolution network (PMCE), a new two-stage pose-to-mesh framework for recovering 3D human mesh from a monocular video. PMCE frst estimates 3D human pose motion in terms of spatial and temporal domains, then performs image-guided pose and mesh interactions by our proposed AdaLN that injects body shape information while preserving their spatial structure. Extensive experiments on popular datasets show that PMCE outperforms state-of-the-art methods in both perframe accuracy and temporal consistency. We hope that our approach will spark further research in 3D human motion estimation considering both pose and shape consistency.
Oct 2, 2023
This paper presents a joint Caption Grounding and Generation (CGG) framework for instance-level open vocabulary segmentation. The main contributions are: (1) using fine-grained object nouns in captions to improve grounding with object queries. (2) using captions as supervision signals to extract rich information from other words helps identify novel categories. To our knowledge, this paper is the first to unify segmentation and caption generation for open vocabulary learning. The proposed framework significantly improves OVIS and OSPS and comparable results on OVOD without pre-training on large-scale datasets.
Oct 2, 2023
An Interweaved Graph and Attention Network (IGANet) for 3D human pose estimation that enables bidirectional communication between GCNs and attentions, capturing both global and local correlations in human skeleton representations.
Jun 4, 2023
A Graph-Aware Transformer (GATOR) framework for 3D human mesh recovery from 2D pose, combining Graph-Aware Transformer encoder and Motion-Disentangled Regression decoder to capture joint-joint, joint-vertex, and vertex-vertex relations.
Jun 4, 2023
We propose Prototypical Cross-Attention Network (PCAN), capable of leveraging rich spatio-temporal information for online multiple object tracking and segmentation.
Sep 12, 2021
We present Quasi-Dense Similarity Learning, which densely samples hundreds of region proposals on a pair of images for contrastive learning.
Feb 20, 2021
Feb 13, 2021
Self-attention is not better than the matrix decomposition~(MD) model developed 20 years ago regarding the performance and computational cost for encoding the long-distance dependencies.
Sep 28, 2020
Jul 3, 2020
A Position Constraint Loss (PCLoss) method for fashion landmark estimation that constrains error landmark locations by utilizing position relationships, applicable to both regression and heatmap-based methods without modifying network structure.
May 4, 2020
Feb 24, 2020
We leverage each object’s category, geometry and appearance features to perform relational embedding, and output a relation matrix that encodes overlap relations. In order to overcome the lack of supervision, we introduce a differentiable module to resolve the overlap between any pair of instances.
Feb 7, 2020
We analyze the effects of time stepping on the Euler method and ResNets. We establish a stability condition for ResNets with step sizes and weight parameters, and point out the effects of step sizes on the stability and performance. Inspired by our analyses, we develop an adaptive time stepping controller that is dependent on the parameters of the current step, and aware of previous steps.
Feb 7, 2020
A Self-Refining Deep Symmetry Enhanced Network (DSEN) for rain removal that extracts rotation equivariant features and uses a self-refining mechanism to remove accumulated rain streaks in a coarse-to-fine manner.
Sep 22, 2019
We formulate the attention mechanism into an expectation-maximization manner and iteratively estimate a much more compact set of bases upon which the attention maps are computed.
Jul 22, 2019
We propose a novel neural network architecture to reduce streak artifacts generated in sparse-view 2D Cone Beam Computed To-mography (CBCT) image reconstruction.
Jun 19, 2019
We propose a novel deep network architecture based on deep convolutional and recurrent neural networksfor single image deraining.
Jul 19, 2018