[Submitted on 29 Oct 2025]
OrthoSign: A Critical Analysis of Hybrid Orthogonalization and Sign-Based Optimization
Abstract: This paper presents a thorough investigation of OrthoSign, a novel optimizer combining orthogonal weight updates with sign-based adaptation for language model training. Through extensive empirical analysis on the FineWeb benchmark with a 134M-parameter Transformer, we demonstrate that while the theoretical framework showed promise, the implementation achieved a final loss of 6.584, significantly underperforming both the Muon (3.537) and AdamW (4.927) baselines. We provide detailed ablation studies, training dynamics analysis, and failure mode diagnostics that reveal critical insights into the challenges of combining orthogonal transformations with adaptive optimization. Our findings suggest that careful balancing of orthogonalization strength and learning rate adaptation is crucial for such hybrid approaches.
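The abstract does not reproduce the update rule itself. A minimal sketch of what a hybrid of the two ingredients it names might look like, assuming the sign step is taken first and then projected to the nearest (semi-)orthogonal matrix via the polar decomposition (as Muon-style optimizers do); the function name and structure here are illustrative, not the paper's actual algorithm:

```python
import numpy as np

def orthosign_step(W, G, lr=0.02):
    """One hypothetical OrthoSign-style update (illustrative sketch only).

    Combines sign-based adaptation with an orthogonalized update:
    the sign of the gradient is projected onto the nearest
    semi-orthogonal matrix using the polar factor from an SVD.
    """
    S = np.sign(G)                                  # sign-based adaptation
    U, _, Vt = np.linalg.svd(S, full_matrices=False)
    O = U @ Vt                                      # nearest semi-orthogonal factor
    return W - lr * O

# Toy usage on a random weight matrix and gradient
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))
G = rng.standard_normal((4, 4))
W_new = orthosign_step(W, G)
```

The polar-factor projection makes every update have unit singular values, so the effective step size is set entirely by `lr`; the abstract's observation that balancing orthogonalization strength against learning-rate adaptation is crucial is consistent with this kind of scale coupling.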
Submission history
[v1] Wed, 29 Oct 2025 14:03 UTC