Model Compression AI News & Updates

Multiverse Computing Releases Ultra-Compact AI Models for Edge Device Deployment

European AI startup Multiverse Computing has released two ultra-compact AI models, SuperFly (94M parameters) and ChickBrain (3.2B parameters), that run locally on smartphones, IoT devices, and laptops without an internet connection. Both use the company's quantum-inspired compression technology, CompactifAI, to retain high performance at a fraction of the usual size; ChickBrain even outperforms the original Llama 3.1 8B model on several benchmarks.

Spanish Startup Raises $215M for AI Model Compression Technology Reducing LLM Size by 95%

Spanish startup Multiverse Computing has raised €189 million ($215M) in Series B funding for its CompactifAI technology, which uses quantum-computing-inspired compression to reduce LLM sizes by up to 95% without performance loss. The company offers compressed versions of open-source models such as Llama and Mistral that run 4x-12x faster and cut inference costs by 50%-80%, enabling deployment on hardware ranging from PCs to a Raspberry Pi. Founded by quantum physics professor Román Orús and former banking executive Enrique Lizaso Olmos, the company claims 160 patents and serves 100 customers globally.
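CompactifAI itself reportedly relies on tensor networks, whose details are proprietary. As a rough illustration of the general principle behind such weight compression (trading a small amount of fidelity for a large parameter reduction), here is a minimal truncated-SVD sketch; the matrix sizes and rank are arbitrary, and this is not Multiverse's actual algorithm:

```python
import numpy as np

# Stand-in for one layer's dense weight matrix.
rng = np.random.default_rng(0)
W = rng.standard_normal((1024, 1024))

# Factor W ~= A @ B with a truncated SVD, keeping only the top singular values.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
rank = 64                     # illustrative cutoff
A = U[:, :rank] * s[:rank]    # shape (1024, 64)
B = Vt[:rank, :]              # shape (64, 1024)

original = W.size             # 1,048,576 parameters
compressed = A.size + B.size  # 131,072 parameters
print(f"parameters: {original} -> {compressed} "
      f"({100 * (1 - compressed / original):.1f}% reduction)")  # 87.5% reduction
```

At inference time the layer computes `x @ A @ B` instead of `x @ W`, so the compressed factors replace the original matrix entirely.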

Microsoft Develops Efficient 1-Bit AI Model Capable of Running on Standard CPUs

Microsoft researchers have created BitNet b1.58 2B4T, the largest 1-bit AI model to date, with 2 billion parameters trained on 4 trillion tokens. The model runs on standard CPUs, including Apple's M2, performs competitively against similar-sized models from Meta, Google, and Alibaba, and operates at roughly twice their speed while using significantly less memory.
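The "1.58-bit" label comes from constraining every weight to one of three values, {-1, 0, +1} (log2(3) ≈ 1.58 bits). A minimal sketch of the absmean-style ternary quantization described in the BitNet b1.58 literature is below; the function name and shapes are illustrative, not Microsoft's implementation:

```python
import numpy as np

def ternary_quantize(W: np.ndarray):
    """Map weights to {-1, 0, +1} with a single per-matrix scale (absmean)."""
    scale = np.abs(W).mean() + 1e-8          # mean absolute value as the scale
    Wq = np.clip(np.round(W / scale), -1, 1)  # round, then clamp to ternary set
    return Wq.astype(np.int8), scale          # ~1.58 bits/weight plus one float

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4)).astype(np.float32)
Wq, scale = ternary_quantize(W)
print(Wq)                 # entries are only -1, 0, or +1
W_approx = Wq * scale     # dequantized approximation used during matmuls
```

Because multiplying by -1, 0, or +1 reduces to negation, skipping, or addition, matrix multiplies need almost no actual multiplications, which is what makes CPU-only inference practical.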