  1. Talking About How Sparse Autoencoders Reshape LLM Interpretability - Zhihu

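    Sparse Autoencoder (SAE): matrix factorization is inefficient because its "orthogonal basis" assumption is very strong, so an SAE instead optimizes a softer constraint: the basis vectors should be distributed as sparsely as possible over the data.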

  2. Using Sparse Autoencoders (SAEs) to Improve Language Model …

    The following is an experimental code implementation based on the paper "Sparse Autoencoders Find Highly Interpretable Features in Language Models" (arXiv:2309.08600v3), covering training a sparse autoencoder (Sparse Autoencoder, …

  3. A Survey on Sparse Autoencoders: Interpreting the Internal …

    Mar 7, 2025 · Among various mechanistic interpretability approaches, Sparse Autoencoders (SAEs) have emerged as a promising method due to their ability to disentangle the complex, …

  4. Sparse Autoencoders in Deep Learning - GeeksforGeeks

    Nov 27, 2025 · To learn efficient data representations with minimal redundancy, Sparse Autoencoders play an important role in deep learning. They are a special type of autoencoder …

  5. These notes describe the sparse autoencoder learning algorithm, which is one approach to automatically learn features from unlabeled data.

  6. We develop a state-of-the-art methodology to reliably train extremely wide and sparse autoencoders with very few dead latents on the activations of any language model. We …

  7. [Machine Learning] SAE (Sparse Autoencoders) - CSDN Blog

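    Jun 13, 2025 · SAE (Sparse Autoencoders) 0. Introduction: Large models have long been regarded as a "black box", and researchers do not yet understand the mechanisms by which their internal neurons interact to produce behavior. Research on mechanistic interpretability therefore …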

  8. Sparse Autoencoder for Mechanistic Interpretability - GitHub

    A sparse autoencoder model, along with all the underlying PyTorch components you need to customise and/or build your own: Encoder, constrained unit norm decoder and tied bias …

  9. A Brief Discussion of Sparse Auto Encoder - Zhihu

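    SAE (Sparse Auto Encoder), a type of Auto Encoder, sparsifies activations and is often used to extract interpretable features from large language models. However, the recent appearance of cocomix and a series of related works has also revealed SAE, as an Auto …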

  10. An Intuitive Explanation of Sparse Autoencoders for LLM ...

    Jun 11, 2024 · A sparse autoencoder transforms the input vector into an intermediate vector, which can be of higher, equal, or lower dimension compared to the input. When applied to …
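
The snippets above describe a sparse autoencoder as mapping an input vector to an intermediate vector (possibly of higher dimension) under a sparsity constraint, then reconstructing the input. A minimal sketch of that idea follows, in plain Python with no dependencies. It is not taken from any of the listed pages: the ReLU encoder, the L1 penalty, the dimensions, and every name (`sae_forward`, `sae_loss`, `l1_coeff`, `init_matrix`) are illustrative assumptions, and the weights are random rather than trained.

```python
import random


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def sae_forward(x, W_enc, b_enc, W_dec, b_dec):
    """Assumed architecture: ReLU encoder followed by a linear decoder."""
    pre = [a + b for a, b in zip(matvec(W_enc, x), b_enc)]
    latent = [max(0.0, p) for p in pre]  # ReLU keeps latents non-negative
    recon = [a + b for a, b in zip(matvec(W_dec, latent), b_dec)]
    return latent, recon


def sae_loss(x, latent, recon, l1_coeff=1e-3):
    """Mean squared reconstruction error plus an L1 sparsity penalty (assumed form)."""
    mse = sum((r - t) ** 2 for r, t in zip(recon, x)) / len(x)
    l1 = sum(abs(f) for f in latent)
    return mse + l1_coeff * l1


def init_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights for the demo."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Demo with an intermediate vector wider than the input (4 -> 8 -> 4).
rng = random.Random(0)
d_in, d_latent = 4, 8
W_enc = init_matrix(d_latent, d_in, rng)
b_enc = [0.0] * d_latent
W_dec = init_matrix(d_in, d_latent, rng)
b_dec = [0.0] * d_in

x = [1.0, -2.0, 0.5, 3.0]
latent, recon = sae_forward(x, W_enc, b_enc, W_dec, b_dec)
loss = sae_loss(x, latent, recon)
print(len(latent), len(recon), loss)
```

Training would adjust the weights to lower this loss. The L1 term is one common way to push most latent entries toward zero, and other sparsity mechanisms exist.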