The Falcon family of LLMs stands out for its training efficiency: its flagship, Falcon-40B, was reportedly trained using roughly 75% of GPT-3’s training compute, 40% of Chinchilla’s, and 80% of PaLM-62B’s.


Falcon-40B

The Technology Innovation Institute (TII) created Falcon-40B, a 40B-parameter causal decoder-only model trained on 1,000B tokens from RefinedWeb enhanced with curated corpora. The model is made available under the TII Falcon LLM License.

Falcon-40B is among the best open-source models currently available. On the OpenLLM Leaderboard, it outperforms models such as LLaMA, StableLM, RedPajama, and MPT.

One of the notable features of Falcon-40B is its architecture, which is optimized for inference. It incorporates FlashAttention, introduced by Dao et al. in 2022, and multi-query attention, described by Shazeer et al. in 2019. These architectural enhancements improve the model’s performance and efficiency during inference, as sketched below.
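
To make the multi-query idea concrete, here is a minimal PyTorch sketch; the shapes and weight names are illustrative, not Falcon’s actual implementation. The key point is that all query heads share a single key/value head, which shrinks the KV cache by a factor of the head count during autoregressive decoding (causal masking is omitted for brevity).

```python
import torch
import torch.nn.functional as F

def multi_query_attention(x, w_q, w_k, w_v, n_heads):
    # x: (batch, seq, d_model). Queries get n_heads heads,
    # but keys/values share a single head across all query heads.
    batch, seq, d_model = x.shape
    head_dim = d_model // n_heads

    q = (x @ w_q).view(batch, seq, n_heads, head_dim).transpose(1, 2)  # (b, h, s, d)
    k = (x @ w_k).view(batch, seq, 1, head_dim).transpose(1, 2)        # (b, 1, s, d)
    v = (x @ w_v).view(batch, seq, 1, head_dim).transpose(1, 2)        # (b, 1, s, d)

    # The single K/V head broadcasts across all query heads, so the
    # KV cache is n_heads times smaller than in standard multi-head attention.
    scores = q @ k.transpose(-2, -1) / head_dim**0.5                   # (b, h, s, s)
    attn = F.softmax(scores, dim=-1)
    out = attn @ v                                                     # (b, h, s, d)
    return out.transpose(1, 2).reshape(batch, seq, d_model)

# Example with toy dimensions: w_q projects to d_model, while
# w_k and w_v project to a single head_dim.
b, s, d, h = 2, 16, 64, 8
x = torch.randn(b, s, d)
w_q = torch.randn(d, d)
w_k = torch.randn(d, d // h)
w_v = torch.randn(d, d // h)
print(multi_query_attention(x, w_q, w_k, w_v, h).shape)  # torch.Size([2, 16, 64])
```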

It is important to note that Falcon-40B is a raw, pre-trained model, and further fine-tuning is typically recommended to tailor it to specific use cases. For applications involving generic instructions in a chat format, the fine-tuned Falcon-40B-Instruct variant is a more suitable starting point.
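
The sketch below shows one way to run Falcon-40B-Instruct with the Hugging Face transformers library, assuming the checkpoint published on the Hub as tiiuae/falcon-40b-instruct and enough GPU memory to hold the weights (roughly 80–90GB in bfloat16, shardable across several devices):

```python
import torch
import transformers
from transformers import AutoTokenizer

model_id = "tiiuae/falcon-40b-instruct"  # checkpoint name on the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
pipe = transformers.pipeline(
    "text-generation",
    model=model_id,
    tokenizer=tokenizer,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,  # Falcon originally shipped custom modeling code
    device_map="auto",       # shard the weights across available GPUs
)

out = pipe(
    "Write a short poem about falcons.",
    max_new_tokens=100,
    do_sample=True,
    top_k=10,
    eos_token_id=tokenizer.eos_token_id,
)
print(out[0]["generated_text"])
```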

Falcon-40B is made available under the TII Falcon LLM License, which permits commercial use of the model. Details regarding the license can be obtained separately.

A paper providing further details about Falcon-40B will be released soon. The availability of this high-quality open-source model presents a valuable resource for researchers, developers, and businesses in various domains.

Falcon-7B

Falcon-7B is a causal decoder-only model developed by TII. It has 7B parameters and was trained on an extensive dataset of 1,500B tokens from RefinedWeb, further enhanced with curated corpora. This model is also made accessible under the TII Falcon LLM License.

One of the primary reasons for choosing Falcon-7B is its strong performance relative to comparable open-source models such as MPT-7B, StableLM, and RedPajama. The extensive training on the enriched RefinedWeb dataset contributes to these capabilities, as demonstrated on the OpenLLM Leaderboard.

Like its larger sibling, Falcon-7B uses an architecture explicitly optimized for inference, integrating FlashAttention (Dao et al., 2022) and multi-query attention (Shazeer et al., 2019). These advancements enhance the model’s efficiency and effectiveness during inference.
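
Because of its smaller footprint, Falcon-7B is easy to try locally. The sketch below loads the raw checkpoint directly, assuming it is published on the Hugging Face Hub as tiiuae/falcon-7b; in bfloat16 the weights fit on a single GPU with roughly 16GB of memory:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/falcon-7b"  # checkpoint name on the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
    device_map="auto",
)

# Generate a short continuation from the raw, pre-trained model.
inputs = tokenizer("The Technology Innovation Institute is", return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_k=10)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```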

It is worth noting that Falcon-7B is likewise available under the TII Falcon LLM License, which permits commercial use of the model. Detailed information about the license can be obtained separately.

While a paper providing comprehensive insights into Falcon-7B is yet to be published, the model’s features and performance make it a valuable resource for researchers, developers, and businesses across various domains.