🦏

【Atmacup #17】 A look at the 1st place solution

Published on 2024/10/03

This time, we take a look at the 1st place solution of Atmacup #17.
・Solution

https://www.guruguru.science/competitions/24/discussions/21027ff1-2074-4e21-a249-b2d4170bd516/

 0. Highlights
Sorry, I'm unfamiliar with LLMs.

Chapter 2:

・LoRA with LLM

・Pseudo-labeling

・Ensemble weights decided by Nelder-Mead

Chapter 3:

・LanguageModelHead instead of SequenceClassificationHead

・Instruction Tuning

 1. Competition Overview
In this competition, participants predicted whether a reviewer would recommend the clothes they reviewed.

 1.1 Metric
AUC
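As a reminder of what the metric measures: AUC is the fraction of (positive, negative) pairs in which the positive example, here a review that recommends the product, receives the higher score. The sketch below is a minimal pure-Python version with made-up labels and scores; it is only meant to illustrate the definition, not the competition's scoring code.

```python
def auc(labels, scores):
    """ROC AUC: fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


# Hypothetical labels (1 = recommended) and predicted scores, for illustration only.
labels = [1, 0, 1, 1, 0]
scores = [0.9, 0.3, 0.4, 0.8, 0.5]
print(auc(labels, scores))  # 5 of the 6 pairs are ordered correctly
```

Because only the ordering of the scores matters, any monotonic transformation of the predictions leaves the AUC unchanged.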

 1.2 Data
・Review content

・Reviewer's age

・Attribute information of the clothes

 2. Solution
 CV
StratifiedKFold (4 splits)
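Stratified splitting keeps the ratio of recommended to not-recommended reviews roughly equal in every fold. In practice this is usually done with scikit-learn's `StratifiedKFold`; the sketch below is a dependency-free approximation of the same idea, not scikit-learn's exact algorithm. The class balance (80/20) and the seed are assumptions made for the example.

```python
import random
from collections import defaultdict


def stratified_kfold(labels, n_splits=4, seed=0):
    """Return a fold id per sample so each fold keeps roughly the overall class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    fold_of = [None] * len(labels)
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled indices of each class round-robin across the folds.
        for pos, idx in enumerate(indices):
            fold_of[idx] = pos % n_splits
    return fold_of


# Hypothetical imbalanced labels: 80 recommended (1) and 20 not recommended (0).
labels = [1] * 80 + [0] * 20
fold_of = stratified_kfold(labels, n_splits=4)
for fold in range(4):
    valid = [i for i, f in enumerate(fold_of) if f == fold]
    print(fold, len(valid), sum(labels[i] for i in valid))
```

Each fold is used once for validation while the other three are used for training.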

 Model (cv/public/private)
gemma-2-9b-it (0.9764/0.9760/0.9793)
LM Head
LoRA
r=8, alpha=16, target_modules=("q_proj", "k_proj", "v_proj", "o_proj", "gate_proj", "up_proj", "down_proj")

learning_rate = 1e-4
1 epoch
RTX 4090, about 2 hours per fold
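To make the `r` and `alpha` settings above concrete: LoRA freezes the original weight matrix W of each target module and learns a low-rank update B·A of rank r, scaled by alpha / r (here 16 / 8 = 2). The sketch below is a toy pure-Python forward pass under those settings. The matrix sizes and random values are made up, and real training would rely on a library such as PEFT rather than hand-written code like this.

```python
import random

R, ALPHA = 8, 16          # values from the settings above
SCALE = ALPHA / R         # LoRA scales the low-rank update by alpha / r


def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W, A, B, x):
    """y = W x + (alpha / r) * B (A x); W stays frozen, only A and B would be trained."""
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + SCALE * u for b, u in zip(base, update)]


rng = random.Random(0)
d_in, d_out = 16, 16      # toy sizes, far smaller than a real projection layer
W = [[rng.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
A = [[rng.gauss(0, 0.01) for _ in range(d_in)] for _ in range(R)]   # r x d_in
B = [[0.0] * R for _ in range(d_out)]                               # d_out x r, zero-initialised
x = [rng.gauss(0, 1) for _ in range(d_in)]

# With B = 0 the adapted layer starts out identical to the frozen base layer.
print(lora_forward(W, A, B, x) == matvec(W, x))
```

Listing all seven projection names in `target_modules` means the adapters are attached to both the attention projections and the MLP projections.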

deberta-v3-large (0.9740/NA/NA)
Trained on the training data concatenated with the test data pseudo-labeled by gemma
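The pseudo-labeling step can be sketched as follows: the unlabeled test reviews receive gemma's predictions as labels and are appended to the training set. Whether the original solution used soft probabilities or thresholded 0/1 labels is not stated, so the soft labels here are an assumption, as are the field names and the example rows.

```python
def add_pseudo_labels(train_rows, test_texts, test_preds):
    """Concatenate labelled training rows with test rows labelled by another model's predictions."""
    pseudo_rows = [
        {"text": text, "label": float(pred), "is_pseudo": True}
        for text, pred in zip(test_texts, test_preds)
    ]
    labelled = [dict(row, label=float(row["label"]), is_pseudo=False) for row in train_rows]
    return labelled + pseudo_rows


# Hypothetical data: two labelled training reviews and two unlabeled test reviews.
train_rows = [{"text": "Love this dress", "label": 1}, {"text": "Poor fit", "label": 0}]
test_texts = ["Great fabric", "Returned it"]
gemma_test_preds = [0.97, 0.08]  # assumed to be gemma's predicted probabilities on the test set

combined = add_pseudo_labels(train_rows, test_texts, gemma_test_preds)
print(len(combined), [row["label"] for row in combined])
```

Keeping an `is_pseudo` flag makes it easy to exclude the pseudo-labeled rows from validation folds, so that CV scores are computed on true labels only.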

ensemble (0.9770/0.9763/0.9796)
simple mean
Weights decided by Nelder-Mead
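Nelder-Mead is a derivative-free simplex search, which suits AUC because AUC is a step function of the blend weights and has no useful gradient. The sketch below searches for weights that maximise AUC on out-of-fold predictions. To stay dependency-free it uses a small hand-rolled Nelder-Mead; in practice `scipy.optimize.minimize(..., method="Nelder-Mead")` is the usual choice. The labels, the predictions and the starting point are all made up for illustration.

```python
def auc(labels, scores):
    """ROC AUC: fraction of (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    correct = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return correct / (len(pos) * len(neg))


def blend(weights, preds):
    """Weighted mean of per-model predictions; weights are made non-negative and normalised."""
    w = [abs(x) for x in weights]
    total = sum(w) or 1.0
    return [sum(wk * p[i] for wk, p in zip(w, preds)) / total for i in range(len(preds[0]))]


def nelder_mead(f, x0, step=0.25, max_iter=100):
    """Minimise f with a basic Nelder-Mead simplex search (reflect, expand, contract, shrink)."""
    n = len(x0)
    simplex = [list(x0)] + [
        [x + (step if i == j else 0.0) for j, x in enumerate(x0)] for i in range(n)
    ]
    for _ in range(max_iter):
        simplex.sort(key=f)
        best, worst = simplex[0], simplex[-1]
        centroid = [sum(p[j] for p in simplex[:-1]) / n for j in range(n)]

        def along(t):
            # Point at centroid + t * (centroid - worst).
            return [c + t * (c - w) for c, w in zip(centroid, worst)]

        reflected = along(1.0)
        if f(reflected) < f(best):
            expanded = along(2.0)
            simplex[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(simplex[-2]):
            simplex[-1] = reflected
        else:
            contracted = along(-0.5)
            if f(contracted) < f(worst):
                simplex[-1] = contracted
            else:
                # Shrink every vertex halfway towards the best one.
                simplex = [best] + [
                    [b + 0.5 * (x - b) for b, x in zip(best, p)] for p in simplex[1:]
                ]
    return min(simplex, key=f)


# Hypothetical out-of-fold predictions from the two models, for illustration only.
labels = [1, 1, 0, 1, 0]
preds = [
    [0.60, 0.45, 0.50, 0.90, 0.10],  # stand-in for gemma
    [0.30, 0.90, 0.50, 0.80, 0.20],  # stand-in for deberta
]

raw = nelder_mead(lambda w: -auc(labels, blend(w, preds)), [0.5, 0.5])
weights = [abs(x) / sum(abs(v) for v in raw) for x in raw]
print("simple mean AUC:", auc(labels, blend([0.5, 0.5], preds)))
print("tuned weights  :", weights, "AUC:", auc(labels, blend(weights, preds)))
```

In this toy data the simple mean misorders one positive/negative pair, while a blend that leans more on the first model does not. Because the weights are tuned on out-of-fold predictions, they can overfit the CV score, and the gain on real data may be much smaller.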


 3. Training and Inference by LM head of Gemma
・Using LanguageModelHead, not SequenceClassificationHead.

・Assign the class labels (0/1) to the No/Yes tokens and perform instruction tuning.

・At inference time, feed the prompt only up to just before {{label}}, and treat the next-token prediction probabilities as the class prediction.
# No=1307, Yes=6287
class_logit = logit[:, -1, [1307, 6287]]
・Input Text
<bos><start_of_turn>user
Consider whether this user would recommend the product based on the reviews.
Answer Yes/No
# Review:
Age: {{age}}
Title: {{title}}
{{review_text}}

<start_of_turn>model
Answer: {{label}}
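The two steps above, building the prompt and reading the class probability off the last position, can be sketched in plain Python. This is a toy illustration, not the author's code: the logits are a made-up dictionary instead of a model output, mapping label 1 to "Yes" is an assumption based on the order of the token ids, and so is cutting the inference prompt right after "Answer:" (whether the following space belongs to the prompt or to the label token depends on the tokenizer).

```python
import math

NO_ID, YES_ID = 1307, 6287  # token ids quoted above

# Prompt template copied from the input text above, without the final label.
PROMPT = (
    "<bos><start_of_turn>user\n"
    "Consider whether this user would recommend the product based on the reviews.\n"
    "Answer Yes/No\n"
    "# Review:\n"
    "Age: {age}\n"
    "Title: {title}\n"
    "{review_text}\n"
    "\n"
    "<start_of_turn>model\n"
    "Answer:"
)


def build_text(age, title, review_text, label=None):
    """Training text ends with the label word; inference text stops right before it."""
    text = PROMPT.format(age=age, title=title, review_text=review_text)
    if label is None:
        return text
    # Assumption: label 1 maps to "Yes" and label 0 maps to "No".
    return text + (" Yes" if label == 1 else " No")


def yes_probability(last_logits):
    """Softmax over just the No/Yes logits at the last position, returning P(Yes)."""
    no_logit, yes_logit = last_logits[NO_ID], last_logits[YES_ID]
    return 1.0 / (1.0 + math.exp(no_logit - yes_logit))


print(build_text(34, "Nice dress", "Fits well and looks great.", label=1))
toy_logits = {NO_ID: 1.2, YES_ID: 3.4}  # hypothetical next-token logits
print(yes_probability(toy_logits))
```

Restricting the softmax to the two answer tokens is one possible choice; since the metric is AUC, any score that increases with the Yes logit relative to the No logit gives the same ranking.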

 4. Extracted Notebook
Score: 0.976/0.9745/0.9755 (cv/public/private)
This notebook has been stripped down to the essential parts and is designed to run on Colab (L4 GPU).
https://colab.research.google.com/drive/1aYu9SnjeMt9uLtLawlHxdzNk50-29bRa?usp=sharing

 5. Summary
It's a simple and great solution.
