News

Harvard Unleashes Huge AI Training Dataset

Harvard University will release an AI training dataset built from around one million books. All of them are in the public domain, unlike the controversial and much smaller Books3 dataset. The release is part of the university’s new Institutional Data Initiative, which is funded by both Microsoft and OpenAI. The dataset is intended to help companies that can’t afford to assemble similar data on their own.