Google AI Introduces JaxPruner: An Open-Source JAX-Based Pruning And Sparse Training Library For Machine Learning Research


Active research in sparsity can improve deep learning efficiency. However, realizing sparsity's potential in practical applications requires closer cooperation between hardware, software, and algorithm research. Such partnerships frequently call for a versatile toolkit that enables rapid concept development and evaluation against various dynamic benchmarks. In neural networks, sparsity can appear in either the activations or the parameters. JaxPruner's primary objective is parameter sparsity, because earlier studies have shown that sparse models can outperform dense models with the same number of parameters.

The scientific community has adopted JAX increasingly over the past few years. JAX's distinct separation between functions and state distinguishes it from well-known deep learning frameworks like PyTorch and TensorFlow. Furthermore, parameter sparsity is a good candidate for hardware acceleration because it is independent of the data. This research focuses on two methods for obtaining parameter sparsity: pruning, which aims to derive sparse networks from dense networks for efficient inference, and sparse training, which aims to train sparse networks from scratch while lowering training costs.
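To make the pruning idea concrete, the sketch below shows magnitude pruning with a binary mask in plain NumPy. This is an illustrative example only, not JaxPruner's actual API; the function name `magnitude_prune` is hypothetical.

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude weights, returning a binary mask.

    Illustrative sketch of parameter-sparsity pruning; hypothetical helper,
    not JaxPruner's API.
    """
    k = int(sparsity * weights.size)  # number of weights to drop
    if k == 0:
        return weights, np.ones_like(weights, dtype=bool)
    # k-th smallest absolute value becomes the pruning threshold
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold  # keep only weights above the cutoff
    return weights * mask, mask

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 4))
pruned, mask = magnitude_prune(w, sparsity=0.5)  # half the weights become zero
```

Sparse training follows the same masking idea but applies it throughout training rather than to an already-trained dense network.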

This functional design reduces the time needed to implement difficult concepts by making function transformations such as taking gradients, computing Hessians, or vectorizing code very simple. Similarly, a function is easy to modify when its complete state is contained in a single location. These qualities also make it simpler to construct procedures shared across several pruning and sparse training methods. Despite existing implementations of individual techniques, as well as sparse training with N:M sparsity and quantization, JAX has lacked a comprehensive library for sparsity research. This gap inspired researchers from Google Research to create JaxPruner.
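The benefit of keeping the complete state in a single location can be sketched in plain Python. In the hypothetical `sgd_step` below, the entire training state goes in and comes out as one value, so utilities such as checkpointing or a pruning hook can wrap any such step without knowing its internals (the names here are assumptions for illustration, not JAX or JaxPruner APIs):

```python
def sgd_step(state, grad, lr=0.1):
    """Pure update function: the whole training state is one explicit value.

    Hypothetical sketch of the functional style JAX encourages; nothing is
    mutated in place, which makes steps easy to wrap, checkpoint, or compose.
    """
    params, step = state
    new_params = [p - lr * g for p, g in zip(params, grad)]
    return (new_params, step + 1)

state = ([1.0, 2.0], 0)                 # (parameters, step counter)
state = sgd_step(state, grad=[0.5, -0.5])
```

Because the state is an ordinary value, saving it, sharding it, or attaching extra pruning variables to it requires no changes to the step function itself.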

They want JaxPruner to support sparsity research and help answer crucial questions like "Which sparsity pattern achieves a desired trade-off between accuracy and performance?" and "Can sparse networks be trained without first training a large dense model?" To accomplish these objectives, three principles guided the design of the library. The first is fast integration: machine learning research moves quickly, and the wide range of ML applications has produced many constantly evolving codebases. The adoption of new research concepts is closely tied to how easily they can be used, so the researchers sought to make it easy to integrate JaxPruner into existing codebases.

To do this, JaxPruner employs the well-known Optax optimization library, requiring only minor adjustments for integration with existing codebases. Because the state variables required for pruning and sparse training techniques are kept with the optimization state, parallelization and checkpointing are straightforward. The second principle is research first: research projects frequently require running many algorithms and baselines, and therefore benefit substantially from rapid prototyping. JaxPruner achieves this by committing to a generic API shared across algorithms, which makes switching between them very simple. The algorithms are designed to be easy to modify, implementations of popular baselines are provided, and switching between popular sparsity structures is equally simple.
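The pattern of carrying pruning variables alongside the optimizer state can be sketched as follows. This is a minimal plain-Python/NumPy illustration of the idea the article describes, not JaxPruner's actual Optax integration; `wrap_with_mask` and `sgd` are hypothetical names:

```python
import numpy as np

def wrap_with_mask(update_fn):
    """Wrap an optimizer update so a binary mask travels with its state.

    Hypothetical sketch: JaxPruner stores pruning state next to the Optax
    optimizer state, so one checkpoint captures both. This mimics that idea.
    """
    def masked_update(params, grad, state):
        mask, inner_state = state
        new_params, inner_state = update_fn(params * mask, grad * mask, inner_state)
        # Re-apply the mask so pruned weights stay exactly zero.
        return new_params * mask, (mask, inner_state)
    return masked_update

def sgd(params, grad, state, lr=0.1):
    """Plain SGD step with a (here unused) optimizer state slot."""
    return params - lr * grad, state

mask = np.array([1.0, 0.0, 1.0])  # the second weight is pruned
step = wrap_with_mask(lambda p, g, s: sgd(p, g, s))
params, opt_state = step(
    np.array([1.0, 1.0, 1.0]), np.array([0.5, 0.5, 0.5]), (mask, None)
)
```

Because the mask lives inside the wrapped state, any code that already checkpoints or shards the optimizer state handles the pruning variables for free.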

The third principle is minimal overhead. There is an increasing variety of methods (CPU acceleration, activation sparsity, etc.) for accelerating sparsity in neural networks. However, integration with current frameworks is frequently lacking, making these advancements relatively challenging to use, particularly in research. Given its primary goal of enabling research, JaxPruner follows the convention of introducing sparsity through binary masks, which adds some extra operations and requires additional storage for the masks; the researchers aimed to keep this overhead minimal. The code is open source and available on GitHub, along with tutorials.


Check out the Paper and GitHub link. Don’t forget to join our 20k+ ML SubReddit, Discord Channel, and Email Newsletter, where we share the latest AI research news, cool AI projects, and more. If you have any questions regarding the above article or if we missed anything, feel free to email us at [email protected]

Check Out 100’s AI Tools in AI Tools Club


Aneesh Tickoo is a consulting intern at MarktechPost. He is currently pursuing his undergraduate degree in Data Science and Artificial Intelligence from the Indian Institute of Technology (IIT), Bhilai. He spends most of his time working on projects aimed at harnessing the power of machine learning. His research interest is image processing, and he is passionate about building solutions around it. He loves to connect with people and collaborate on interesting projects.