Wang, Ziyue
Image2Gcode: Image-to-G-code Generation for Additive Manufacturing Using Diffusion-Transformer Model
Mechanical design and manufacturing workflows conventionally begin with conceptual design, followed by the creation of a computer-aided design (CAD) model and fabrication through material-extrusion (MEX) printing. This process requires con…
SAM2S: Segment Anything in Surgical Videos via Semantic Long-term Tracking
Surgical video segmentation is crucial for computer-assisted surgery, enabling precise localization and tracking of instruments and tissues. Interactive Video Object Segmentation (iVOS) models such as Segment Anything Model 2 (SAM2) provid…
Cross-platform Clinical Proteomics using the Charité Open Standard for Plasma Proteomics (OSPP)
We present the Charité Open Peptide Standard for plasma proteomics (OSPP), an open resource composed of 211 isotope-labeled peptides, intended to be used as an internal standard for plasma and serum proteomic projects. The OSPP was designe…
Diving into Mitigating Hallucinations from a Vision Perspective for Large Vision-Language Models
Object hallucination in Large Vision-Language Models (LVLMs) significantly impedes their real-world applicability. As the primary component for accurately interpreting visual information, the choice of visual encoder is pivotal. We hypothe…
MUCAR: Benchmarking Multilingual Cross-Modal Ambiguity Resolution for Multimodal Large Language Models
Multimodal Large Language Models (MLLMs) have demonstrated significant advances across numerous vision-language tasks. Due to their strong performance in image-text alignment, MLLMs can effectively understand image-text pairs with clear me…
DongbaMIE: A Multimodal Information Extraction Dataset for Evaluating Semantic Understanding of Dongba Pictograms
The Dongba script is the only pictographic script still in use in the world. Its pictorial ideographic features carry rich cultural and contextual information. However, due to the lack of relevant datasets, research on semantic understan…
Structure Matters: Revisiting Boundary Refinement in Video Object Segmentation
Given an object mask, Semi-supervised Video Object Segmentation (SVOS) aims to track and segment the object across video frames, serving as a fundamental task in computer vision. Although recent memory-based methods demonstrate p…
Functional Time Series Forecasting of Distributions: A Koopman-Wasserstein Approach
We propose a novel method for forecasting the temporal evolution of probability distributions observed at discrete time points. Extending the Dynamic Probability Density Decomposition (DPDD), we embed distributional dynamics into Wasserste…
MedAgent-Pro: Towards Evidence-based Multi-modal Medical Diagnosis via Reasoning Agentic Workflow
In modern medicine, clinical diagnosis relies on the comprehensive analysis of primarily textual and visual data, drawing on medical expertise to ensure systematic and rigorous reasoning. Recent advances in large Vision-Language Models (VL…
EgoLife: Towards Egocentric Life Assistant
We introduce EgoLife, a project to develop an egocentric life assistant that accompanies and enhances personal efficiency through AI-powered wearable glasses. To lay the foundation for this assistant, we conducted a comprehensive data coll…
Mowing intensity and duration reshape and decouple plant and microbial communities
We collected 363 peer-reviewed articles related to mowing (3555 data points) and conducted a meta-analysis. The indicators covered the impacts of mowing on plant biomass, richness, and microbial growth and metabolism. The focus was on anal…