by Sam Havens and Erica Ji Yuen
Today, we are releasing MPT-7B-8k, a 7B-parameter open-source LLM with an 8k context length, trained on the MosaicML platform. Starting from the MPT-7B checkpoint, MPT-7B-8k was pretrained on an additional 500B tokens of data in 3 days on 256 NVIDIA H100s.
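To put those training figures in perspective, here is a rough back-of-the-envelope calculation of the implied throughput. It uses only the numbers quoted above and assumes that "3 days" means exactly 72 hours of wall-clock time, so treat the results as order-of-magnitude estimates rather than measured values.

```python
# Figures quoted in the announcement. Treating "3 days" as exactly
# 72 hours is an assumption made for this estimate.
TOKENS = 500e9   # additional pretraining tokens
GPUS = 256       # NVIDIA H100s
DAYS = 3

seconds = DAYS * 24 * 60 * 60
gpu_hours = GPUS * DAYS * 24
tokens_per_second = TOKENS / seconds
tokens_per_gpu_second = tokens_per_second / GPUS

print(f"GPU-hours: {gpu_hours:,}")
print(f"Cluster throughput: {tokens_per_second:,.0f} tokens/s")
print(f"Per-GPU throughput: {tokens_per_gpu_second:,.0f} tokens/s")
```

Under these assumptions the run works out to about 18,000 GPU-hours and roughly 7,500 tokens per second per GPU.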
We are pleased to announce an addition to our MPT Foundation Series with the release of MPT-7B-8k. With its 8k context length, MPT-7B-8k specializes in document summarization and question answering. Like all the other models in the MPT Foundation Series, MPT-7B-8k is optimized for faster training and inference, and can be finetuned on domain-specific data on the MosaicML platform.
Today, we are releasing 3 models:
Whether you're looking to streamline workflows, improve the understanding of complex documents, or simply save time and effort, MPT-7B-8k on the MosaicML platform is a great starting point for businesses that want to reason over their own text data.
MPT-7B-8k is:
In addition, MPT-7B-8k models perform similarly to or better than other open-source 8k context length models on our in-context learning evaluation harness. To learn more about the harness and to see full results comparing different open-source LLMs, check out our new LLM evaluation page.
If you prefer qualitative results, feel free to download the model and share your results in our Community Slack, or check out the long-context reading comprehension example at the bottom of this blog.


Want to get started deploying LLMs trained and customized on your data? The MosaicML platform gives you the tools and infrastructure to easily and efficiently build, customize, and deploy MPT-style models on your secure cloud of choice. If you're interested in training and deploying your own MPT models or other LLMs on the MosaicML platform, sign up here.
In the following example we have provided MPT-7B-8K-Instruct with a long passage about rats from the MCAS Grade 10 English Language Arts Reading Comprehension Exam and asked it to give an explanation of something mentioned in the text. The model gives a reasonable answer given the content of the paragraph.
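If you want to try a similar long-passage question yourself, the sketch below shows one way to assemble a prompt and check that it plausibly fits in the context window before sending it to the model. Everything here is illustrative: the `build_prompt` and `estimate_tokens` helpers and the prompt layout are hypothetical, "8k" is assumed to mean 8,192 tokens, and the four-characters-per-token ratio is a crude heuristic for English text. For an exact count, use the model's own tokenizer.

```python
CONTEXT_LENGTH = 8192    # assumption: "8k" means 8,192 tokens
CHARS_PER_TOKEN = 4      # assumption: crude heuristic for English text


def estimate_tokens(text: str) -> int:
    """Rough token estimate: character count divided by 4, rounded up."""
    return -(-len(text) // CHARS_PER_TOKEN)


def build_prompt(passage: str, question: str, max_new_tokens: int = 256) -> str:
    """Join a passage and a question, leaving room for the model's answer.

    The layout below is a hypothetical format, not the model's official
    prompt template.
    """
    prompt = f"{passage.strip()}\n\nQuestion: {question.strip()}\nAnswer:"
    budget = CONTEXT_LENGTH - max_new_tokens
    used = estimate_tokens(prompt)
    if used > budget:
        raise ValueError(
            f"Prompt is about {used} tokens; only {budget} fit in the window."
        )
    return prompt


if __name__ == "__main__":
    prompt = build_prompt(
        "Rats are social animals that live in groups.",
        "According to the passage, how do rats live?",
    )
    print(prompt)
```

Reserving `max_new_tokens` up front keeps the passage from crowding out the space the model needs for its answer.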


When would I choose…