My Projects
Cross-Domain Pre-Training for Time-Series Foundation Models
ICLR 2025, Foundation Models in the Wild Workshop
Despite the success of cross-domain pre-training in language models,
time-series data pose unique challenges due to distinct temporal
patterns and sampling discrepancies. This study systematically evaluates
whether cross-domain pre-training benefits time-series foundation models
(TSFMs). We reveal that while it can enhance performance in some
domains, it also introduces negative transfer in others.
Counterintuitively, unrelated domains can be helpful, while related ones
may degrade performance. Our findings underscore the need for tailored
pre-training strategies and provide actionable insights for developing
effective TSFMs.
Benchmarking Point and Distributional Forecasting across Diverse Prediction Horizons
NeurIPS 2024, Datasets and Benchmarks Track [GitHub]
Delivering precise point and distributional forecasts across a spectrum
of prediction horizons is a long-standing challenge in industrial
applications of time-series forecasting.
While there is a rising trend in developing universal forecasting
models, a thorough understanding of their advantages and drawbacks,
especially regarding essential forecasting needs like point and
distributional forecasts across short and long horizons, is still
lacking. We introduce ProbTS, a comprehensive benchmark encompassing 26
models and 16 datasets across domains. We dissect the data
characteristics arising from different forecasting requirements and show
how they skew methodological preferences in typical research
trajectories, which often fail to fully accommodate essential
forecasting needs. Building on this analysis, we examine recent
universal time-series forecasting models and find that our observations
about methodological strengths and weaknesses also apply to them.
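As a rough illustration of the two forecasting needs the benchmark targets, the sketch below computes a point metric (MAE of the sample median) and a distributional metric (sample-based CRPS) at several prediction horizons. The metric choices and horizon values are illustrative assumptions and are not tied to ProbTS's actual interface.

```python
import numpy as np

def crps_from_samples(samples: np.ndarray, y: float) -> float:
    """Sample-based CRPS approximation: E|X - y| - 0.5 * E|X - X'|."""
    term1 = np.mean(np.abs(samples - y))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term1 - term2

def evaluate_horizons(samples: np.ndarray, target: np.ndarray,
                      horizons=(24, 96, 336, 720)):
    """Report a point metric (MAE of the sample median) and a distributional
    metric (CRPS) at several prediction horizons.

    samples: (num_samples, max_horizon) forecast draws for one series
    target:  (max_horizon,) ground-truth values
    """
    report = {}
    for h in horizons:
        point = np.median(samples[:, :h], axis=0)              # point forecast
        mae = float(np.mean(np.abs(point - target[:h])))        # point metric
        crps = float(np.mean([crps_from_samples(samples[:, t], target[t])
                              for t in range(h)]))              # distributional
        report[h] = {"MAE": mae, "CRPS": crps}
    return report
```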
Towards Robust Varied-Horizon Forecasting with Elastic Time-Series Transformer
NeurIPS 2024 [GitHub]
Robust forecasting across varied horizons is essential yet underexplored
in current time-series models. We introduce Elastic Time-Series
Transformer (ElasTST), a non-autoregressive model with structured
attention masks to ensure horizon-invariant outputs and tunable rotary
position embeddings to facilitate extrapolation. Its
multi-scale patch design captures both fine- and coarse-grained
patterns, and a horizon reweighting strategy simulates multi-horizon
training. Experiments show that ElasTST achieves up to 20% improvement
in long-horizon accuracy (1024 steps).
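To make the horizon-invariance idea concrete, here is a minimal sketch of one structured attention mask with that property, assuming forecast placeholders attend only to the observed history and never to each other; the actual ElasTST masking and its interplay with tunable rotary embeddings follow the paper.

```python
import torch

def elastic_attention_mask(num_obs: int, num_horizon: int) -> torch.Tensor:
    """Boolean attention mask (True = attend) over [history | horizon placeholders].

    Observed positions attend to each other; each placeholder attends only to
    the observed history (plus itself). Because placeholders never attend to
    one another, appending more placeholders does not change the output at an
    earlier forecast step, giving horizon-invariant outputs.
    """
    L = num_obs + num_horizon
    mask = torch.zeros(L, L, dtype=torch.bool)
    mask[:num_obs, :num_obs] = True                  # history <-> history
    mask[num_obs:, :num_obs] = True                  # placeholders -> history
    mask[torch.arange(L), torch.arange(L)] = True    # keep self-attention
    return mask

# Mask for 4 observed steps and a 3-step horizon
print(elastic_attention_mask(4, 3).int())
```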
Multi-Scale Modeling for Irregularly Sampled Time Series
SIGKDD 2023, Research Track [GitHub]
Irregular sampling poses challenges in time-series modeling, especially
when both intra-series and inter-series discrepancies are present. We
present Warpformer, the first multi-scale architecture for irregularly sampled
time series, with a learnable warping module to normalize sampling
scales and custom attention layers for multi-scale representation
learning. Warpformer achieves up to 7% accuracy improvement across
downstream tasks on real-world clinical benchmarks.
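The sketch below illustrates the warping idea with a simple differentiable resampling from irregular timestamps onto a fixed grid. It is a toy stand-in for Warpformer's learnable warping module; the softmax kernel and temperature are assumptions made purely for illustration.

```python
import torch

def soft_resample(values: torch.Tensor, times: torch.Tensor,
                  grid: torch.Tensor, tau: float = 0.1) -> torch.Tensor:
    """Map an irregularly sampled series onto a fixed grid by softmax-weighting
    observations by their temporal distance to each grid point.

    values: (N,) observed values      times: (N,) observation timestamps
    grid:   (M,) target grid times    tau:   temperature (smoothness)
    """
    dist = (grid[:, None] - times[None, :]).abs()     # (M, N) time gaps
    weights = torch.softmax(-dist / tau, dim=-1)      # closer observations weigh more
    return weights @ values                           # (M,) regular-grid series

values = torch.tensor([1.0, 3.0, 2.0, 5.0])
times = torch.tensor([0.0, 0.7, 2.4, 6.1])
grid = torch.linspace(0.0, 6.0, steps=7)
print(soft_resample(values, times, grid))
```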
Knowledge-Enhanced Domain Adaptation in Few-Shot Relation Classification
SIGKDD 2021, Research Track [GitHub]
Relation classification (RC) is a key task in knowledge extraction but
typically requires large annotated datasets. While few-shot RC models
perform well in general domains, their effectiveness drops sharply when
adapting to specialized fields like medicine. We propose KEFDA, a
knowledge-enhanced few-shot RC model for domain adaptation, which
integrates both general and domain-specific knowledge graphs to improve
cross-domain generalization. By leveraging concept-level semantics,
KEFDA captures relation types from limited examples and transfers this
capability across domains. It combines a knowledge-augmented
prototypical network for instance matching with a relation-meta learning
module for implicit relation reasoning. On the FewRel 2.0 domain
adaptation benchmark, KEFDA achieves state-of-the-art performance,
ranking 1st in multiple subtasks with an average improvement of 6.63%
over the runner-up.
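For intuition, here is a minimal prototypical-network matching step of the kind KEFDA builds on; the knowledge augmentation (fusing concept-level embeddings from knowledge graphs) and the relation-meta learning module are omitted in this toy sketch.

```python
import torch

def prototypical_logits(support_emb: torch.Tensor, support_labels: torch.Tensor,
                        query_emb: torch.Tensor, num_classes: int) -> torch.Tensor:
    """Classify queries by negative distance to class prototypes, where each
    prototype is the mean embedding of that class's support instances.

    support_emb:    (N, d) support-instance embeddings
    support_labels: (N,)   class indices
    query_emb:      (Q, d) query-instance embeddings
    """
    prototypes = torch.stack([
        support_emb[support_labels == c].mean(dim=0)   # per-class prototype
        for c in range(num_classes)
    ])                                                  # (C, d)
    return -torch.cdist(query_emb, prototypes)          # (Q, C), higher = closer

# 2-way 3-shot toy example with random 8-d embeddings
support = torch.randn(6, 8)
labels = torch.tensor([0, 0, 0, 1, 1, 1])
queries = torch.randn(4, 8)
print(prototypical_logits(support, labels, queries, num_classes=2).argmax(dim=-1))
```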
Distant Supervision for Polyphone Disambiguation in Mandarin Chinese
INTERSPEECH 2020, Oral Presentation
Grapheme-to-phoneme (G2P) conversion is vital for Mandarin Chinese
text-to-speech (TTS), where polyphone disambiguation is a key challenge.
Existing models rely heavily on manually annotated data, which suffer
from limited coverage and imbalanced distributions. We propose a
distantly supervised framework that predicts character pronunciations
using automatically aligned character-phoneme pairs to train a Seq2Seq
model with attention. A phoneme-level language model is also introduced
to mitigate noise in the generated data. Without relying on syntactic
features or pre-trained embeddings, our method achieves competitive
performance, particularly improving accuracy on imbalanced polyphonic
characters. Overall classification accuracy increases from 88.39% to
93.53%, with results more aligned with natural pronunciation
patterns.
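As a toy illustration of the noise-filtering idea, the sketch below scores automatically aligned phoneme sequences with a tiny bigram phoneme language model and keeps only plausible alignments. The actual phoneme-level LM and threshold used in the paper differ; the model and values here are placeholders.

```python
import math
from collections import Counter

def train_bigram_lm(phoneme_seqs):
    """Train a tiny add-one-smoothed bigram LM over phoneme tokens and return
    a log-probability scorer (toy stand-in for a phoneme-level LM)."""
    unigrams, bigrams = Counter(), Counter()
    for seq in phoneme_seqs:
        tokens = ["<s>"] + seq
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    vocab = len(unigrams)

    def log_prob(seq):
        tokens = ["<s>"] + seq
        return sum(math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab))
                   for a, b in zip(tokens, tokens[1:]))
    return log_prob

def keep_aligned_pair(phoneme_seq, log_prob, threshold=-4.0):
    """Keep an automatically aligned phoneme sequence only if the LM finds it
    plausible; the threshold here is a hypothetical placeholder."""
    return log_prob(phoneme_seq) / max(len(phoneme_seq), 1) > threshold
```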