IndoNLP | Data Catalogue

Load this dataset directly with the Datasets library

First, install the dependency and clone the NusaCrowd repository:

pip install datasets
git clone https://github.com/IndoNLP/nusa-crowd.git

Then download the dataset locally with the Python script below:

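from datasets import load_dataset

# Path to the dataset's loader directory inside the cloned repository
path = "nusa-crowd/nusacrowd/nusa_datasets/TMP"
# Run the loader script; the data is downloaded and cached locally
dataset = load_dataset(path)

# See a dataset sample: the train split as a pandas DataFrame
print(dataset['train'].to_pandas())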
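
The script above assumes the loader directory exists in the clone and that the dataset has a train split. The following is a minimal sketch of a more defensive variant, not part of NusaCrowd itself: the helper names `find_loader_dir` and `preview` and the `dataset_name` parameter are hypothetical, introduced only for illustration. It checks the path before loading and falls back to the first available split when there is no train split.

```python
from pathlib import Path


def find_loader_dir(repo_root, dataset_name):
    """Return the loader directory for dataset_name inside a NusaCrowd clone.

    Hypothetical helper: it only mirrors the directory layout used in the
    script above (nusacrowd/nusa_datasets/<name>) and checks that it exists.
    """
    loader_dir = Path(repo_root) / "nusacrowd" / "nusa_datasets" / dataset_name
    if not loader_dir.is_dir():
        raise FileNotFoundError(f"No loader directory at {loader_dir}")
    return loader_dir


def preview(repo_root, dataset_name, n_rows=5):
    """Load the dataset and return the first n_rows of one split as a DataFrame."""
    # Imported here so the path check above works without `datasets` installed.
    from datasets import load_dataset

    dataset = load_dataset(str(find_loader_dir(repo_root, dataset_name)))
    # Without a `split` argument, load_dataset returns a dict-like object
    # keyed by split name; prefer "train", otherwise take the first split.
    split = "train" if "train" in dataset else next(iter(dataset))
    return dataset[split].to_pandas().head(n_rows)
```

Calling `preview("nusa-crowd", name)` with the name of a loader directory from the clone would then print a few rows, or fail early with a clear error if the directory is missing.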
