BPE-tokenized OpenWebText
Emerging
1 paper using it
First seen: 2026
BPE-tokenized OpenWebText is a dataset of text that has been tokenized with Byte Pair Encoding (BPE). It is used to evaluate model performance on various language tasks.
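To illustrate what BPE tokenization does, here is a minimal toy sketch in Python. It is not the tokenizer used to build this dataset: the entry does not specify the vocabulary, merge count, or implementation. The function names (`merge_pair`, `learn_bpe`, `encode`) are hypothetical, and the sketch works on characters of whole words, whereas production tokenizers often work on bytes.

```python
from collections import Counter


def merge_pair(symbols, pair):
    """Replace each left-to-right occurrence of `pair` with one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy BPE)."""
    # Each word starts as a tuple of single characters.
    vocab = Counter(tuple(w) for w in words)
    merges = []
    for _ in range(num_merges):
        # Count adjacent symbol pairs, weighted by word frequency.
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break  # every word is already a single symbol
        # Pick the most frequent pair (ties go to the pair seen first).
        best = max(pairs, key=pairs.get)
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def encode(word, merges):
    """Tokenize one word by applying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)
```

For example, calling `learn_bpe(["abab", "abab", "ab"], 5)` learns the merge `("a", "b")` first because that pair is the most frequent, and `encode("abc", merges)` then yields the tokens `["ab", "c"]`. Concatenating the tokens always reproduces the original word.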