Kunlong Chen
WSM: Decay-Free Learning Rate Schedule via Checkpoint Merging for LLM Pre-training
Recent advances in learning rate (LR) scheduling have demonstrated the effectiveness of decay-free approaches that eliminate the traditional decay phase while maintaining competitive performance. Model merging techniques have emerged as pa…
Towards Greater Leverage: Scaling Laws for Efficient Mixture-of-Experts Language Models
Mixture-of-Experts (MoE) has become a dominant architecture for scaling Large Language Models (LLMs) efficiently by decoupling total parameters from computational cost. However, this decoupling creates a critical challenge: predicting the …
Iron artefacts used at an ancient jade mine in the Hexi Corridor: a technical observation
This paper presents a study of 45 iron objects unearthed from the ancient Jingbao’er jade mining site in the western Hexi Corridor, dating from the Warring States period to the early West Han dynasty (about fourth–first centuries BCE). Met…
BOSE: A Systematic Evaluation Method Optimized for Base Models
This paper poses two critical issues in evaluating base models (without post-training): (1) Unstable evaluation during training: in the early stages of pre-training, the models lack the capability to answer questions as required, leading t…
Revisiting Mechanism of NaOH Dechlorination Treatments for Bronze Conservation in Quantitative Study
Dechlorination is a crucial strategy for archeological bronze stabilization to resist corrosion induced by cuprous chloride (CuCl). Conventional samples, either archeological or simulated ones, have deficiencies in revealing dechlorination…
Segmentation and visualization of the Shampula dragonfly eye glass bead CT images using a deep learning method
Micro-computed tomography (CT) of ancient Chinese glass dragonfly eye beads has enabled detailed exploration of their internal structures, contributing to our understanding of their manufacture. Segmentation of these CT images is essential…
A Study on Metallurgical Artifacts Excavated from Luojiaba Site H235 in the Eastern Sichuan Region during the Eastern Han Dynasty
Currently, research remains limited on ironworking workshops in China and even throughout East Asia. The discovery of Luojiaba Site H235 in 2021 provides significant new material on this issue. This paper comprehensively organized the meta…
Archaeometric study of the iron objects from the Xuechi sacrificial site and its implication for bloomery iron smelting during early Western Han period in China
Metallographic examination and compositional study of slag inclusions on iron objects unearthed from Xuechi in Shaanxi, China, have revealed the smelting and manufacturing techniques employed at this Western Han dynasty sacrificial site. T…
Site formation process of the Dadong Paleolithic site in Jilin province, China: A geoarchaeological approach
The Dadong site, located in the Changbaishan region of Jilin province, China, is an important Upper Paleolithic site characterized by its large distribution area and abundant stone artifacts. This study presents a geoarchaeological study o…
GP-NAS-ensemble: a model for NAS Performance Prediction
It is of great significance to estimate the performance of a given model architecture without training in the application of Neural Architecture Search (NAS) as it may take a lot of time to evaluate the performance of an architecture. In t…
X-ray computed tomography reveals special casting techniques used with unusual bronze objects unearthed from the Sanxingdui site
Scholars in a wide range of disciplines are interested in the casting techniques used to create the extraordinary bronze objects unearthed from the two pits of the Sanxingdui site. Although researchers have carried out a number of studies …
DQN Control Solution for KDD Cup 2021 City Brain Challenge
We took part in the city brain challenge competition and achieved the 8th place. In this competition, the players are provided with a real-world city-scale road network and its traffic demand derived from real traffic data. The players are…
Glassmaking of the Qing Dynasty: A Review, New Data, and New Insights
Full major and minor chemical compositions, including F, B, As, and Cl, have been produced quantitatively using EPMA for glass samples originating from the Qing Dynasty glass collection of the Bristol Museum and Art Gallery and glass sampl…
Question Directed Graph Attention Network for Numerical Reasoning over Text
Numerical reasoning over texts, such as addition, subtraction, sorting and counting, is a challenging machine reading comprehension task, since it requires both natural language understanding and arithmetic computation. To address this cha…
SpellGCN: Incorporating Phonological and Visual Similarities into Language Models for Chinese Spelling Check
Chinese Spelling Check (CSC) is a task to detect and correct spelling errors in Chinese natural language. Existing methods have made attempts to incorporate the similarity knowledge between Chinese characters. However, they take the simila…
Question Directed Graph Attention Network for Numerical Reasoning over Text
Kunlong Chen, Weidi Xu, Xingyi Cheng, Zou Xiaochuan, Yuyu Zhang, Le Song, Taifeng Wang, Yuan Qi, Wei Chu. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2020.
Towards Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning
The ambiguous annotation criteria lead to divergence of Chinese Word Segmentation (CWS) datasets in various granularities. Multi-criteria Chinese word segmentation aims to capture various annotation criteria among datasets and leverage the…
Symmetric Regularization based BERT for Pair-wise Semantic Reasoning
The ability of semantic reasoning over the sentence pair is essential for many natural language understanding tasks, e.g., natural language inference and machine reading comprehension. A recent significant improvement in these tasks comes …
Toward Fast and Accurate Neural Chinese Word Segmentation with Multi-Criteria Learning
The ambiguous annotation criteria lead to divergence of Chinese Word Segmentation (CWS) datasets in various granularities. Multi-criteria Chinese word segmentation aims to capture various annotation criteria among datasets and leverage the…
Convolutional sequence to sequence non‐intrusive load monitoring
A convolutional sequence to sequence non‐intrusive load monitoring model is proposed in this study. Gated linear unit convolutional layers are used to extract information from the sequences of aggregate electricity consumption. Residual bl…