training datasets AI News & Updates

Research Breakthrough

EleutherAI released The Common Pile v0.1, an 8-terabyte dataset of openly licensed and public-domain text, developed over two years with multiple partners. The dataset was used to train two AI models that reportedly perform compar...

Open Source AI Transparency Copyright EleutherAI training datasets


EleutherAI Creates Massive Licensed Dataset to Train Competitive AI Models Without Copyright Issues