【Atmacup #17】 A look at the 1st place solution
This post walks through the 1st place solution of Atmacup #17.
・Solution
0. Highlights
Sorry, I'm not very familiar with LLMs.
Chapter 2:
・LoRA on an LLM
・Pseudo labels
・Weights decided by Nelder-Mead
Chapter 3:
・LanguageModelHead instead of SequenceClassificationHead
・Instruction Tuning
1. Competition Overview
In this competition, participants predicted whether a reviewer would recommend the clothing item they reviewed.
1.1 Metric
AUC
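As a reminder of what the metric measures, AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The following is a minimal sketch of that definition in plain Python; the function name `auc_score` and the toy numbers are illustrative and not taken from the competition.

```python
def auc_score(labels, scores):
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative counts as half a correct pair.
    This is O(n_pos * n_neg): fine for illustration, slow for large data.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy example: 1 = recommended, 0 = not recommended (made-up numbers).
labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.3, 0.4, 0.8, 0.6]
toy_auc = auc_score(labels, scores)
print(f"AUC = {toy_auc:.4f}")
```

Because only the ordering of the scores matters, any monotonic rescaling of the predictions leaves AUC unchanged.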
1.2 Data
・Review content
・Reviewer's age
・Attribute information of the clothes
2. Solution
CV
StratifiedKFold (4 splits)
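A 4-split stratified CV might be set up roughly as follows with scikit-learn. This is only a sketch: the labels are made up, and `shuffle=True` and `random_state=0` are assumptions, since the post only states the number of splits.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Made-up imbalanced labels: 80 "recommended" (1), 20 "not recommended" (0).
y = np.array([1] * 80 + [0] * 20)
X = np.zeros((len(y), 1))  # placeholder features; the split depends only on y

# shuffle and random_state are assumptions; the source only states 4 splits.
skf = StratifiedKFold(n_splits=4, shuffle=True, random_state=0)

fold_pos_rates = []
for fold, (train_idx, valid_idx) in enumerate(skf.split(X, y)):
    # Stratification keeps the class ratio of each validation fold
    # close to the ratio of the full dataset.
    pos_rate = y[valid_idx].mean()
    fold_pos_rates.append(pos_rate)
    print(f"fold {fold}: train={len(train_idx)} valid={len(valid_idx)} "
          f"pos_rate={pos_rate:.2f}")
```

Keeping the class ratio stable across folds matters here because the fold-level AUC values are then computed on comparable label distributions.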
Model (cv/public/private)
- gemma-2-9b-it (0.9764/0.9760/0.9793)
- LM Head
- LoRA
- r=8, alpha=16, target_modules=("q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj")
- learning_rate = 1e-4
- 1 epoch
- RTX 4090, about 2 h per fold
- deberta-v3-large (0.9740/NA/NA)
- trained on the training data concatenated with the test data pseudo-labeled by gemma
- ensemble (0.9770/0.9763/0.9796)
- simple mean
- Weights decided by Nelder-Mead
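Deciding ensemble weights with Nelder-Mead could look roughly like the sketch below, which maximizes the AUC of a weighted blend of out-of-fold predictions using `scipy.optimize.minimize`. The data is synthetic, the names `oof_a` and `oof_b` are hypothetical stand-ins for the two models' predictions, and starting from equal weights is an assumption rather than something the post states.

```python
import numpy as np
from scipy.optimize import minimize


def auc(y_true, y_score):
    """ROC AUC via pairwise comparison (ties count as 0.5)."""
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())


# Synthetic stand-ins for the out-of-fold predictions of two models.
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=500)
oof_a = y + rng.normal(0.0, 0.8, size=500)  # hypothetical "gemma" scores
oof_b = y + rng.normal(0.0, 1.0, size=500)  # hypothetical "deberta" scores
oof = np.stack([oof_a, oof_b], axis=1)      # shape (n_samples, n_models)


def neg_auc(w):
    """Objective for the minimizer: negative AUC of the weighted blend."""
    return -auc(y, oof @ w)


# Start from the simple mean (an assumption) and let Nelder-Mead adjust it.
x0 = np.full(oof.shape[1], 1.0 / oof.shape[1])
res = minimize(neg_auc, x0, method="Nelder-Mead")

weights = res.x
mean_auc = -neg_auc(x0)
blend_auc = -res.fun
print(f"simple mean AUC={mean_auc:.4f}, tuned AUC={blend_auc:.4f}, "
      f"weights={weights}")
```

Nelder-Mead is a natural fit because it needs no gradients, and AUC is a piecewise-constant function of the weights. Weights tuned on out-of-fold predictions can overfit the CV score slightly, so the gain is worth checking against a held-out score.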
3. Training and Inference by LM head of Gemma
・Uses LanguageModelHead rather than SequenceClassificationHead.
・The class labels (0/1) are mapped to the Yes/No tokens, and the model is instruction-tuned.
・At inference, the prompt is fed only up to the point just before the label, and the "Next Token Prediction" probabilities are treated as the class prediction.
# No=1307, Yes=6287
class_logit = logit[:, -1, [1307, 6287]]
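To show how the two selected logits can become a class score, here is a small NumPy sketch using fake logits of shape (batch, seq_len, vocab). Applying a softmax over the two tokens and treating the "Yes" column as the score for label 1 are assumptions; the post only shows the slicing step. The vocabulary is shrunk to 7000 entries just to keep the demo small.

```python
import numpy as np

NO_ID, YES_ID = 1307, 6287  # token ids quoted in the post

# Fake logits with shape (batch, seq_len, vocab). A real run would take these
# from the causal LM output; the vocab size here is reduced for the demo.
rng = np.random.default_rng(0)
logit = rng.normal(size=(3, 5, 7000))
logit[0, -1, YES_ID] = 10.0  # make sample 0 clearly "Yes"
logit[1, -1, NO_ID] = 10.0   # make sample 1 clearly "No"

# Keep only the last position and the two candidate tokens: shape (batch, 2).
class_logit = logit[:, -1, [NO_ID, YES_ID]]

# Softmax over the two tokens only (shifted by the max for stability).
shifted = class_logit - class_logit.max(axis=1, keepdims=True)
prob = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

p_yes = prob[:, 1]  # assumed to be the score for label 1 (recommend)
print(p_yes)
```

Since AUC depends only on the ranking, the raw difference between the "Yes" and "No" logits would give the same score as this two-way softmax.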
・Input Text
<bos><start_of_turn>user
Consider whether this user would recommend the product based on the reviews.
Answer Yes/No
# Review:
Age: {{age}}
Title: {{title}}
{{review_text}}
<start_of_turn>model
Answer: {{label}}
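The template above could be filled in with a small helper like the following sketch. The function name `build_prompt`, the `LABEL_TO_WORD` mapping (0 to "No", 1 to "Yes"), and the exact whitespace around "Answer:" are assumptions; the special tokens are copied as they appear in the post.

```python
PROMPT_HEAD = (
    "<bos><start_of_turn>user\n"
    "Consider whether this user would recommend the product based on the reviews.\n"
    "Answer Yes/No\n"
    "# Review:\n"
    "Age: {age}\n"
    "Title: {title}\n"
    "{review_text}\n"
    "<start_of_turn>model\n"
    "Answer:"
)

LABEL_TO_WORD = {0: "No", 1: "Yes"}  # assumed mapping of class label to word


def build_prompt(age, title, review_text, label=None):
    """Return the training text (label given) or the inference prefix (label=None)."""
    text = PROMPT_HEAD.format(age=age, title=title, review_text=review_text)
    if label is None:
        # Inference: stop before the label so the next token can be scored.
        return text
    return f"{text} {LABEL_TO_WORD[label]}"


# Made-up review used only to show the two variants.
train_text = build_prompt(34, "Love it", "Fits perfectly.", label=1)
infer_text = build_prompt(34, "Love it", "Fits perfectly.")
print(train_text)
```

If the tokenizer already prepends `<bos>` on its own, the literal `<bos>` in the template would need to be dropped or `add_special_tokens` disabled to avoid a duplicated token.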
4. Extracted Notebook
Score: 0.976/0.9745/0.9755 (cv/public/private)
This notebook has been stripped down to the essential parts and is designed to run on Colab (L4).
5. Summary
It's a simple yet great solution.