German colossal, cleaned Common Crawl corpus (GC4) released

Tags: german-data, text-corpus, common-crawl
Categories: Data, NLP