
xFormers: A Modular and Hackable Transformer Modelling Library
================


The xFormers library is an open-source, lightweight toolkit for researchers and developers working on modern transformer-based models. It provides efficient, composable building blocks that can be mixed and matched to assemble complex transformer architectures, and the same blocks can be reused across domains, including vision and natural language processing.

Design Philosophy


At the core of xFormers' design philosophy is the goal of providing a modular, easily extensible framework that lets researchers build transformer models quickly and efficiently. Because the building blocks are independent and customizable, xFormers encourages experimentation and the exploration of new models and techniques.

Building Blocks


xFormers provides a wide range of building blocks that can be used to construct transformer models. These include:

  • Attention mechanisms: scaled dot-product, sparse, BlockSparse, Linformer-style attention, and more. These mechanisms can be mixed and matched to build custom attention patterns (see the sketch after this list).
  • Feedforward blocks: a variety of options, including MLP, FusedMLP, Mixture of Experts, and Conv2DFeedforward, implementing the position-wise transformations in the transformer architecture.
  • Position embeddings: composable embedding blocks, including sinusoidal and rotary variants.
  • layer_norm: a composable layer-normalization block used to normalize the outputs of the other blocks.
  • MultiHeadDispatch: an optimized multi-head wrapper that handles the input/output projections and parallelizes computation across attention heads (also shown in the sketch below).
  • Custom ops: a registration mechanism for creating custom operators that extend the functionality of the library (a registration sketch follows the list as well).
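
The sketch below is a minimal, hedged illustration of how these pieces compose: an attention mechanism is built from a config dict through the registry and then wrapped with MultiHeadDispatch. The exact constructor arguments (for example residual_dropout) may differ between versions, so check them against the documentation for the release you use.

```python
import torch

from xformers.components import MultiHeadDispatch, build_attention

BATCH, SEQ, EMB, HEADS = 2, 128, 256, 4

# Build an attention mechanism from a config dict via the registry;
# "scaled_dot_product" can be swapped for any other registered name.
attention = build_attention(
    {"name": "scaled_dot_product", "dropout": 0.1, "causal": False}
)

# MultiHeadDispatch adds the input/output projections and splits the
# computation across attention heads.
multi_head = MultiHeadDispatch(
    dim_model=EMB,
    num_heads=HEADS,
    attention=attention,
    residual_dropout=0.0,
)

x = torch.randn(BATCH, SEQ, EMB)
y = multi_head(query=x, key=x, value=x)  # -> (BATCH, SEQ, EMB)
```

Extending the library follows the same registry pattern. The hypothetical MyAttention below registers a toy mechanism under a custom name; the config fields and base-class signature shown are assumptions to verify against the docs.

```python
from dataclasses import dataclass

import torch

from xformers.components.attention import (
    Attention,
    AttentionConfig,
    register_attention,
)


@dataclass
class MyAttentionConfig(AttentionConfig):
    pass  # extra constructor fields for the new mechanism would go here


@register_attention("my_attention", MyAttentionConfig)
class MyAttention(Attention):
    """Toy mechanism: plain scaled dot product, standing in for a new idea."""

    def __init__(self, dropout: float = 0.0, *args, **kwargs):
        super().__init__()
        self.drop = torch.nn.Dropout(dropout)

    def forward(self, q, k, v, *args, **kwargs):
        scores = (q @ k.transpose(-2, -1)) / (q.shape[-1] ** 0.5)
        return self.drop(torch.softmax(scores, dim=-1)) @ v
```

Once registered, {"name": "my_attention", ...} can be passed to build_attention just like the built-in mechanisms.
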
Performance Considerations


xFormers is designed to be fast and memory-efficient. It includes its own CUDA kernels for critical operations and dispatches to other libraries when that is faster for a given workload. The library is also designed to be easy to maintain, allowing updates and improvements to land in a backwards-compatible manner.
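
As one hedged example of this dispatching, the memory-efficient attention operator in xformers.ops selects the best available backend at runtime; the shapes and dtype below are purely illustrative.

```python
import torch

import xformers.ops as xops

# Inputs are (batch, seq_len, num_heads, head_dim); half precision on a
# GPU is what typically reaches the fused kernels.
q = torch.randn(1, 1024, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

# Dispatches to the fastest available backend for these shapes and this
# hardware (custom CUDA kernels, Flash-Attention, or a fallback).
out = xops.memory_efficient_attention(q, k, v)

# Causal (autoregressive) masking is requested through attn_bias.
out_causal = xops.memory_efficient_attention(
    q, k, v, attn_bias=xops.LowerTriangularMask()
)
```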

Community and Support


xFormers has an active community of developers and researchers who use the library in their work. The project's GitHub repository is the main venue for discussions, questions, and contributions, and the library is part of the broader PyTorch ecosystem.

Citing xFormers


If you use xFormers in your research, please consider citing the repository as follows:

```bibtex
@Misc{xFormers2022,
  author =       {Benjamin Lefaudeux and Francisco Massa and Diana Liskovich and Wenhan Xiong and Vittorio Caggiano and Sean Naren and Min Xu and Jieru Hu and Marta Tintore and Susan Zhang and Patrick Labatut and Daniel Haziza and Luca Wehrstedt and Jeremy Reizenstein and Grigory Sizov},
  title =        {xFormers: A modular and hackable Transformer modelling library},
  howpublished = {\url{https://github.com/facebookresearch/xformers}},
  year =         {2022}
}
```

xFormers is a versatile, efficient, and composable transformer modelling library that provides researchers with the building blocks needed to build modern transformer models. Its design philosophy of modularity and ease of maintenance, combined with its performant implementation and active community, makes it a valuable resource for anyone working on transformer-based projects.
