My Projects

Cross-Domain Pre-Training for Time-Series Foundation Models
ICLR 2025, Foundation Models in the Wild Workshop
Despite the success of cross-domain pre-training in language models, time-series data pose unique challenges due to distinct temporal patterns and sampling discrepancies. This study systematically evaluates whether cross-domain pre-training benefits time-series foundation models (TSFMs). We reveal that while it can enhance performance in some domains, it also introduces negative transfer in others. Counterintuitively, unrelated domains can be helpful, while related ones may degrade performance. Our findings underscore the need for tailored pre-training strategies and provide actionable insights for developing effective TSFMs.

Benchmarking Point and Distributional Forecasting across Diverse Prediction Horizons
NeurIPS 2024, Datasets and Benchmarks Track [GitHub]
Delivering precise point and distributional forecasts across a spectrum of prediction horizons is a significant and enduring challenge in applying time-series forecasting across industries. While there is a rising trend toward universal forecasting models, a thorough understanding of their advantages and drawbacks, especially for essential forecasting needs such as point and distributional forecasts over short and long horizons, is still lacking. We introduce ProbTS, a comprehensive benchmark encompassing 26 models and 16 datasets across domains. We dissect the distinctive data characteristics arising from disparate forecasting requirements and show how these characteristics can skew methodological preferences along typical research trajectories, which often fail to fully accommodate essential forecasting needs. Building on this, we examine the latest universal time-series forecasting models and find that our analyses of methodological strengths and weaknesses also apply to them.
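As a quick illustration of the point-versus-distributional distinction the benchmark is built around, the sketch below scores a point forecast with MAE and a sample-based forecast with an energy-form CRPS estimate. The function names and toy data are illustrative only and are not taken from the ProbTS codebase.

```python
import numpy as np

def mae(point_forecast, target):
    """Point-forecast error: mean absolute error."""
    return np.mean(np.abs(point_forecast - target))

def crps_from_samples(forecast_samples, target):
    """Sample-based CRPS estimate for a distributional forecast.

    forecast_samples: array of shape (num_samples, horizon)
    target:           array of shape (horizon,)

    Uses the standard energy-form estimator:
      CRPS ~ E|X - y| - 0.5 * E|X - X'|
    """
    abs_err = np.mean(np.abs(forecast_samples - target))           # E|X - y|
    spread = np.mean(np.abs(forecast_samples[:, None, :]
                            - forecast_samples[None, :, :]))       # E|X - X'|
    return abs_err - 0.5 * spread

# Illustrative usage with random numbers (not ProbTS data):
rng = np.random.default_rng(0)
target = rng.normal(size=24)                                # 24-step horizon
samples = target + rng.normal(scale=0.5, size=(100, 24))    # 100 forecast samples
print(mae(samples.mean(axis=0), target), crps_from_samples(samples, target))
```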

Towards Robust Varied-Horizon Forecasting with Elastic Time-Series Transformer
NeurIPS 2024 [GitHub]
Robust forecasting across varied horizons is essential yet underexplored in current time-series models. We introduce the Elastic Time-Series Transformer (ElasTST), a non-autoregressive model with structured attention masks that ensure horizon-invariant outputs and tunable rotary position embeddings that facilitate extrapolation. Its multi-scale patch design captures both fine- and coarse-grained patterns, and a horizon reweighting strategy simulates multi-horizon training. Experiments show that ElasTST achieves up to a 20% improvement in long-horizon accuracy (1024 steps).
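To make the "horizon-invariant" idea concrete, here is a minimal sketch of one plausible structured attention mask: forecast placeholders attend only to observed history (and themselves), never to other placeholders, so each placeholder's output does not depend on how many future steps are requested. This is an assumption-laden illustration, not the official ElasTST implementation.

```python
import torch

def horizon_invariant_mask(num_history: int, num_horizon: int) -> torch.Tensor:
    """Sketch of a structured attention mask (True = may attend).

    Assumption (not the ElasTST code): history tokens attend to each other,
    while forecast placeholders attend only to history and to themselves.
    """
    total = num_history + num_horizon
    mask = torch.zeros(total, total, dtype=torch.bool)
    mask[:num_history, :num_history] = True            # history <-> history
    mask[num_history:, :num_history] = True            # placeholder -> history
    mask[num_history:, num_history:] = torch.eye(num_horizon, dtype=torch.bool)  # self only
    return mask

# A 4-step and an 8-step request produce identical attention patterns for the
# shared placeholder positions, which is the horizon-invariance idea.
print(horizon_invariant_mask(num_history=6, num_horizon=4))
```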

Multi-Scale Modeling for Irregularly Sampled Time Series
SIGKDD 2023, Research Track [GitHub]
Irregular sampling poses challenges in time-series modeling, especially when both intra-series and inter-series discrepancies are present. We present Warpformer, the first multi-scale architecture for irregularly sampled time series, with a learnable warping module that normalizes sampling scales and custom attention layers for multi-scale representation learning. Warpformer achieves up to a 7% accuracy improvement across downstream tasks on real-world clinical benchmarks.
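The sketch below conveys the intuition behind a learnable warping step: an irregularly sampled series is softly resampled onto a small set of learnable anchor positions, normalizing series with very different sampling rates onto a common scale. Module name, shapes, and the soft-attention kernel are assumptions for illustration, not the official Warpformer module.

```python
import torch
import torch.nn as nn

class SoftWarp(nn.Module):
    """Illustrative learnable warping step (not the Warpformer implementation).

    Maps an irregularly sampled series onto K learnable anchor positions by
    soft-attending over observation timestamps.
    """
    def __init__(self, num_anchors: int, temperature: float = 0.1):
        super().__init__()
        self.anchors = nn.Parameter(torch.linspace(0.0, 1.0, num_anchors))
        self.temperature = temperature

    def forward(self, values, times, mask):
        # values: (B, T, D) observations; times: (B, T) in [0, 1]; mask: (B, T)
        dist = (self.anchors[None, :, None] - times[:, None, :]) ** 2    # (B, K, T)
        logits = -dist / self.temperature
        logits = logits.masked_fill(mask[:, None, :] == 0, float("-inf"))
        weights = torch.softmax(logits, dim=-1)                          # (B, K, T)
        return weights @ values                                          # (B, K, D)

# Illustrative usage with toy shapes:
warp = SoftWarp(num_anchors=8)
out = warp(torch.randn(2, 30, 4), torch.rand(2, 30), torch.ones(2, 30))
print(out.shape)  # torch.Size([2, 8, 4])
```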

Knowledge-Enhanced Domain Adaptation in Few-Shot Relation Classification
SIGKDD 2021, Research Track [GitHub]
Relation classification (RC) is a key task in knowledge extraction but typically requires large annotated datasets. While few-shot RC models perform well in general domains, their effectiveness drops sharply when adapting to specialized fields like medicine. We propose KEFDA, a knowledge-enhanced few-shot RC model for domain adaptation, which integrates both general and domain-specific knowledge graphs to improve cross-domain generalization. By leveraging concept-level semantics, KEFDA captures relation types from limited examples and transfers this capability across domains. It combines a knowledge-augmented prototypical network for instance matching with a relation-meta learning module for implicit relation reasoning. On the FewRel 2.0 domain adaptation benchmark, KEFDA achieves state-of-the-art performance, ranking 1st in multiple subtasks with an average improvement of 6.63% over the runner-up.
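For readers unfamiliar with the prototypical-network matching that underlies the instance-matching component, the minimal sketch below computes class prototypes from support embeddings and assigns each query to its nearest prototype. The knowledge-graph enhancement and relation-meta module are omitted; all names and shapes here are hypothetical, not the KEFDA code.

```python
import torch

def prototypical_classify(support_emb, support_labels, query_emb, num_classes):
    """Minimal prototypical-network matching (knowledge enhancement omitted).

    support_emb:    (N_support, D) embeddings of labeled support instances
    support_labels: (N_support,)   integer relation labels
    query_emb:      (N_query, D)   embeddings of query instances
    Returns the predicted relation label for each query.
    """
    prototypes = torch.stack([
        support_emb[support_labels == c].mean(dim=0) for c in range(num_classes)
    ])                                                   # (C, D) class centroids
    dists = torch.cdist(query_emb, prototypes)           # (N_query, C) Euclidean
    return dists.argmin(dim=-1)                          # nearest prototype wins

# Toy 3-way, 3-shot example with random embeddings:
sup = torch.randn(9, 16)
lab = torch.tensor([0, 0, 0, 1, 1, 1, 2, 2, 2])
print(prototypical_classify(sup, lab, torch.randn(5, 16), num_classes=3))
```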

Distant Supervision for Polyphone Disambiguation in Mandarin Chinese
INTERSPEECH 2020, Oral Presentation
Grapheme-to-phoneme (G2P) conversion is vital for Mandarin Chinese text-to-speech (TTS), where polyphone disambiguation is a key challenge. Existing models rely heavily on manually annotated data, which suffer from limited coverage and imbalanced distributions. We propose a distantly supervised framework that automatically aligns character-phoneme pairs and uses them to train an attention-based Seq2Seq model for predicting character pronunciations. A phoneme-level language model further mitigates noise in the automatically generated data. Without relying on syntactic features or pre-trained embeddings, our method achieves competitive performance, particularly improving accuracy on imbalanced polyphonic characters. Overall classification accuracy increases from 88.39% to 93.53%, with results more closely aligned with natural pronunciation patterns.
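As a rough illustration of how a phoneme-level language model can screen noisy, automatically aligned examples, the sketch below scores a phoneme sequence with a toy bigram table and drops examples whose average score is too low. The bigram table, scoring interface, and threshold are invented for this example and are not the system described in the paper.

```python
def lm_log_prob(phonemes, bigram_log_probs, unk_log_prob=-10.0):
    """Score a phoneme sequence with a simple bigram LM (a hypothetical stand-in
    for the phoneme-level language model used to filter noisy distant labels)."""
    score = 0.0
    for prev, cur in zip(["<s>"] + phonemes, phonemes + ["</s>"]):
        score += bigram_log_probs.get((prev, cur), unk_log_prob)
    return score

def keep_example(phonemes, bigram_log_probs, threshold=-6.0):
    """Discard automatically aligned examples whose per-token LM score is low,
    on the assumption that they are likely alignment noise."""
    avg = lm_log_prob(phonemes, bigram_log_probs) / (len(phonemes) + 1)
    return avg >= threshold

# Toy usage with a hand-made bigram table, purely for illustration:
bigrams = {("<s>", "zh"): -0.5, ("zh", "ong1"): -0.7, ("ong1", "</s>"): -0.4}
print(keep_example(["zh", "ong1"], bigrams))   # True: plausible syllable
print(keep_example(["x", "q7"], bigrams))      # False: unseen bigrams score poorly
```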