How to use AutoModelForSequenceClassification for Multi-Class Text Classification?

I am trying to use Hugging Face's AutoModelForSequenceClassification API for multi-class classification, but I am confused about how to configure it.

My labels are one-hot encoded and the problem is multi-class (exactly one label per example).

What I have tried:

from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased",
                                                           num_labels=6,
                                                           id2label=id2label,
                                                           label2id=label2id)



batch_size = 8
metric_name = "f1"  # must match a key returned by compute_metrics

from transformers import TrainingArguments, Trainer

args = TrainingArguments(
    "bert-finetuned-english",  # output directory
    evaluation_strategy = "epoch",
    save_strategy = "epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    num_train_epochs=10,
    weight_decay=0.01,
    load_best_model_at_end=True,
    metric_for_best_model=metric_name,
    #push_to_hub=True,
)


trainer = Trainer(
    model,
    args,
    train_dataset=encoded_dataset["train"],
    eval_dataset=encoded_dataset["test"],
    tokenizer=tokenizer,
    compute_metrics=compute_metrics
)
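compute_metrics is not shown in the post. A minimal sketch of what it could look like for this setup, assuming macro-averaged F1 from scikit-learn and a returned key named "f1" so that metric_for_best_model can find it (the Trainer looks it up as eval_f1). The averaging mode and the handling of one-hot labels are assumptions, not something the original code confirms:

```python
import numpy as np
from sklearn.metrics import f1_score


def compute_metrics(eval_pred):
    """Hypothetical metric function: macro F1 for single-label multi-class."""
    # The Trainer passes an EvalPrediction, which unpacks to (predictions, label_ids).
    logits, labels = eval_pred
    logits = np.asarray(logits)
    labels = np.asarray(labels)

    # Predicted class = index of the largest logit.
    preds = logits.argmax(axis=-1)

    # Assumption: if the labels are still one-hot (2-D), reduce them to class ids.
    if labels.ndim == 2:
        labels = labels.argmax(axis=-1)

    # The key "f1" is what metric_for_best_model="f1" refers to.
    return {"f1": f1_score(labels, preds, average="macro")}
```

Macro averaging weights every class equally; average="weighted" or "micro" may be a better fit depending on class balance.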

Is this correct?

I am confused about the loss function. When I print the output of one forward pass, the loss has been computed with BinaryCrossEntropyWithLogits:

SequenceClassifierOutput([('loss',
                           tensor(0.6986, grad_fn=<BinaryCrossEntropyWithLogitsBackward0>)),
                          ('logits',
                           tensor([[-0.5496,  0.0793, -0.5429, -0.1162, -0.0551]],
                                  grad_fn=<AddmmBackward0>))])

That loss is meant for multi-label or binary classification tasks. Shouldn't the model use nn.CrossEntropyLoss here?

How do I properly use this API for multi-class classification and make it use the right loss function?

Also looking for this. Did you figure it out?
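A possible explanation, offered as a sketch rather than a confirmed answer: when config.problem_type is not set, the BERT sequence-classification head picks its loss from the labels it receives. With more than one label and integer (long) labels it uses cross-entropy; with float labels, such as one-hot vectors, it falls back to the multi-label setting and uses BCEWithLogitsLoss. The example below uses a tiny, randomly initialised BertConfig (the sizes are arbitrary assumptions chosen so nothing has to be downloaded) and a hypothetical helper tiny_model to show both cases:

```python
import torch
from transformers import BertConfig, BertForSequenceClassification


def tiny_model(problem_type=None):
    # Hypothetical helper: a tiny random BERT, only to inspect which loss is used.
    config = BertConfig(
        vocab_size=100, hidden_size=32, num_hidden_layers=1,
        num_attention_heads=2, intermediate_size=64,
        num_labels=6, problem_type=problem_type,
    )
    return BertForSequenceClassification(config)


input_ids = torch.randint(0, 100, (2, 8))  # dummy batch: 2 sequences of 8 tokens
one_hot = torch.tensor([[0, 0, 1, 0, 0, 0],
                        [1, 0, 0, 0, 0, 0]], dtype=torch.float)

# Float one-hot labels, problem_type unset: the model infers multi-label -> BCE.
out_bce = tiny_model()(input_ids=input_ids, labels=one_hot)
bce_name = type(out_bce.loss.grad_fn).__name__

# Integer class ids (argmax of the one-hot rows) plus an explicit problem_type:
# the model uses cross-entropy.
class_ids = one_hot.argmax(dim=1)  # shape (2,), dtype int64
out_ce = tiny_model("single_label_classification")(
    input_ids=input_ids, labels=class_ids
)
ce_name = type(out_ce.loss.grad_fn).__name__

print(bce_name, ce_name)
```

If this is what is happening in the original setup, converting the label column to one integer class id per example should be enough on its own. Passing problem_type="single_label_classification" to from_pretrained alongside num_labels=6 would additionally make the intent explicit.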