[Kaggle] BirdCLEF2025 65th Place Retrospective
1. Introduction
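I won a medal in BirdCLEF2025, also known as the bird competition. I had never entered a competition on anything other than tabular data, so almost everything was new to me, but with help from ChatGPT and the wisdom of those who came before, I managed to reach the medal I had set as my goal. This post is my write-up of the experience.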
2. Competition Overview
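Wild animals living in tropical rainforests are very sensitive to environmental change. Their calls and other sounds are an important clue to the state of the ecosystem. However, surveying a large forest on foot is hard work and costs both time and money.
This Kaggle competition therefore takes on the challenge of using AI to identify which animals are present from audio data. The targets are not only birds but also frogs, mammals, insects, and other creatures.
If the technology works well, forest recovery can be monitored remotely. Local people and researchers can learn where and which animals are returning, which makes the effect of conservation work visible. The difference from previous years is that calls of non-bird animals such as mammals and insects were added.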
3. Solution
In this competition, I preprocessed the audio data and trained the models with the following pipeline.
Audio data → human voice removal → 5 sec clipping (random) → conversion to mel spectrogram images → model training
Some of the training recordings contained speech reading out the animal's name. Because this could mislead the model, I added a preprocessing step that removes the human voice component, using past Kaggle notebooks as a reference.
The training recordings vary in length: some are shorter than 5 seconds, while others run longer than 1 minute. To unify the input size, a part of each recording has to be cut out. There are several ways to do this.
・Cut from the beginning
・Cut from the middle
・Cut at random
・RMS (Root Mean Square: cut preferentially from where the sound energy is high)
In the end I chose random cropping. When I validated across several seeds, RMS gave a slight accuracy gain (+0.02), but I doubted whether that processing was still needed after the speech had been removed, and I had a feeling it would overfit to the Public LB, so I prioritized generalization and sampled at random.
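The random and RMS-based cropping options described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the 32 kHz sample rate, the zero-padding of short recordings, the 0.5-second hop between RMS candidates, and all function names are hypothetical and do not come from the actual solution.

```python
import numpy as np

SR = 32000          # assumed sample rate (not stated in the post)
CLIP_SEC = 5        # clip length used in the post


def pad_to_length(wave: np.ndarray, length: int) -> np.ndarray:
    """Zero-pad recordings that are shorter than the target length."""
    if len(wave) >= length:
        return wave
    return np.pad(wave, (0, length - len(wave)))


def random_crop(wave, rng, sr=SR, sec=CLIP_SEC):
    """Pick a window of `sec` seconds at a uniformly random offset."""
    length = sr * sec
    wave = pad_to_length(wave, length)
    start = int(rng.integers(0, len(wave) - length + 1))
    return wave[start:start + length]


def rms_crop(wave, sr=SR, sec=CLIP_SEC, hop_sec=0.5):
    """Pick the candidate window with the highest RMS energy.

    Candidates start every `hop_sec` seconds (an assumption for this sketch).
    """
    length = sr * sec
    wave = pad_to_length(wave, length)
    hop = int(sr * hop_sec)
    starts = range(0, len(wave) - length + 1, hop)

    def rms(start):
        window = wave[start:start + length]
        return float(np.sqrt(np.mean(window ** 2)))

    best = max(starts, key=rms)
    return wave[best:best + length]
```

Both functions return exactly `SR * CLIP_SEC` samples, so either one can feed the same mel spectrogram step.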
My final submission was an ensemble of models with the following backbones.
・efficientnet_b0 (simple CNN)
・eca_nfnet_l0 (SED)
4. What Worked
・Model Soup
Model soup is a technique that builds a single new model by simply averaging the weights of multiple models: https://arxiv.org/abs/2203.05482
Unlike an ensemble, it does not run several models at inference, so it brings in model diversity without increasing inference time.
This technique was very effective in this competition.
As in previous years, the competition had a large domain shift, which made it hard to design a proper validation scheme. So rather than relying on a model from one particular epoch and seed, I averaged the weights of the models from seeds 0 to 4 and epochs 1 to 20, aiming to improve generalization while avoiding overfitting.
As a result, I believe model soup let me incorporate the diverse features learned at different epochs.
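The uniform averaging behind model soup can be sketched as below. To keep the example self-contained, NumPy arrays stand in for real checkpoints: the `{name: array}` dictionaries imitate a framework's state dict, and the parameter names and toy values are hypothetical. Only the seed and epoch ranges follow the post.

```python
import numpy as np


def uniform_soup(state_dicts):
    """Average a list of {parameter_name: array} dicts key by key."""
    if not state_dicts:
        raise ValueError("need at least one checkpoint")
    keys = state_dicts[0].keys()
    for sd in state_dicts:
        if sd.keys() != keys:
            raise ValueError("all checkpoints must share the same parameter names")
    return {k: np.mean([sd[k] for sd in state_dicts], axis=0) for k in keys}


# Toy checkpoints for seeds 0 to 4 and epochs 1 to 20 (100 in total).
checkpoints = [
    {
        "fc.weight": np.full((2, 3), float(seed + epoch)),
        "fc.bias": np.full(2, float(seed)),
    }
    for seed in range(5)
    for epoch in range(1, 21)
]
soup = uniform_soup(checkpoints)
```

The result is one set of weights with the same shapes as any single checkpoint, which is why inference cost does not grow. With real checkpoints, all models must share one architecture, and non-float entries such as step counters need separate handling.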
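・SED model
I knew from the discussions and past competitions that this was effective, but the implementation looked difficult, so I had put it off... In the end I worked to understand it and got it implemented. (Of course, I used past competitions as a reference.)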
・Faster inference with ONNX and OpenVINO
Inference apparently becomes about 40% faster. However, the conversion did not work well when the backbone was NF_NET.
・Label smoothing
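For readers unfamiliar with label smoothing, here is a minimal sketch of one common form for multi-label 0/1 targets. The post does not say which variant or smoothing strength was used, so both the formula and `eps=0.1` are assumptions.

```python
def smooth_labels(targets, eps=0.1):
    """Move hard 0/1 targets toward 0.5: 0 -> eps/2, 1 -> 1 - eps/2."""
    return [t * (1.0 - eps) + 0.5 * eps for t in targets]


# A clip labeled with the second of three species (toy example).
smoothed = smooth_labels([0.0, 1.0, 0.0], eps=0.1)
```

The softened targets discourage the model from becoming overconfident on noisy labels.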
5. What Did Not Work
・Establishing a validation method
In this competition the number of samples per species was extremely skewed, and I struggled a lot with designing a validation strategy.
For example, the most common species has 990 samples, while the rarest has only 2. Because of this extreme class imbalance, I could not split the data with ordinary cross-validation (Stratified K-Fold and the like).
I spent a considerable amount of time on this problem, but unfortunately it did not lead to a final result.
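To see why stratified splitting breaks down here, it helps to list the species that have fewer samples than folds. The sketch below uses the 990 and 2 counts from the post. The species names, the 5-fold setting, and the function name are hypothetical.

```python
from collections import Counter


def classes_too_rare_for_stratification(labels, n_splits=5):
    """Return {species: count} for species with fewer samples than folds."""
    counts = Counter(labels)
    return {name: n for name, n in counts.items() if n < n_splits}


# Toy label list: one common species and one rare species.
labels = ["common_species"] * 990 + ["rare_species"] * 2
too_rare = classes_too_rare_for_stratification(labels, n_splits=5)
```

A species with 2 recordings can appear in the validation split of at most 2 of the 5 folds, so the remaining folds cannot score it at all.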
Approaches tried and lessons learned
- Data augmentation for minority classes
  I tried to inflate the data by applying transformations to the audio, such as adding noise and pitch shifting.
- Increasing the sampling frequency for minority classes only
  With this method, however, the model ends up repeatedly learning "data with similar features", so it most likely had little effect.
- Adding extra data for minority classes
  I added extra data only for minority classes so that every fold contained every species, but the validation score and the LB score diverged.
  → The low quality of the extra data may have been the main cause.
The 2nd place solution also mentions how hard validation was. In particular, its observation that the correlation between CV and LB scores almost disappears once AUC improvements fall below 1% matched my own trial and error.
"After some major improvements, we saw significant positive changes in both metrics, but when tweaking things within ~1% AUC, the correlation was nearly absent."
-- 2nd Place Discussion
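- Pseudo labeling
  I tried various threshold settings, but it did not work.
- Time Flip
  The temporal structure of bird calls is important, and reversing time seems to keep the model from capturing the features well.
- Running inference on 10 seconds, including the 2.5 seconds before and after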
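The 10-second context idea in the last item can be illustrated by computing, for each 5-second target segment, the window that adds 2.5 seconds on each side. This sketch assumes a 60-second recording scored in 5-second segments and shifts the window inward at the edges so it keeps its full length. None of these details are stated in the post, and the function name is hypothetical.

```python
def context_windows(total_sec=60.0, seg_sec=5.0, pad_sec=2.5):
    """Return (seg_start, seg_end, win_start, win_end) in seconds per segment."""
    windows = []
    win_len = seg_sec + 2 * pad_sec
    n_segments = int(total_sec // seg_sec)
    latest_start = max(total_sec - win_len, 0.0)
    for i in range(n_segments):
        seg_start = i * seg_sec
        seg_end = seg_start + seg_sec
        # Shift the window inward at the edges so it keeps its full length.
        win_start = min(max(seg_start - pad_sec, 0.0), latest_start)
        windows.append((seg_start, seg_end, win_start, win_start + win_len))
    return windows


windows = context_windows()
```

Each tuple pairs the segment being scored with the longer window that would be fed to the model.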
6. Solutions of the Top Finishers
- Sampling method
The sampling strategy shared by many of the top finishers was:
Clip length: 5 or 10 seconds.
Extraction method: random sampling, or energy-based extraction using RMS (Root Mean Square).
I had also tried cropping longer segments (10 and 15 seconds), but the LB score barely changed, and I eventually settled on random 5-second sampling.
Bird calls often last longer than 5 seconds, but given the concern that non-calling portions would become noise and the trade-off with processing speed, I decided to set this idea aside for the time being. Still, I feel it is worth testing again.
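What impressed me most was the following method introduced in the 2nd place solution.
"Randomly extract 5 seconds from the first or last 7 seconds of the audio"
This method is based on a human behavior pattern, "press record when the bird is calling and stop when it finishes", and I thought it was an excellent idea that ties human intuition to the structure of the real data. Being able to turn this kind of insight into a working implementation is a skill I definitely want to develop.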
"We trained our models on 5s and randomly selected segments. Based on experience from last year, we initially tried three approaches to picking 5s segments: random 5s from the whole audio, random 5s from the first 7s, and random 5s from the first or last 7s. The reasoning for the last two approaches is that often the recorder starts the audio when the animal is vocalizing and stops it when the animal stops vocalizing. That would help the model avoid false positives. The 7s was just to add some diversity."
-- 2nd Place Discussion
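The "first or last 7 seconds" sampling described in the quote can be sketched as follows. This is not the 2nd place team's code: the 32 kHz sample rate, the 50/50 choice between the two regions, the handling of short recordings, and the function name are all assumptions made for illustration.

```python
import random


def first_or_last_crop_start(n_samples, sr=32000, clip_sec=5, region_sec=7, rng=random):
    """Return the start index of a random clip inside the first or last region."""
    clip = sr * clip_sec
    if n_samples <= clip:
        return 0  # too short to choose an offset; the caller would pad instead
    region = min(sr * region_sec, n_samples)
    slack = region - clip  # how far the clip can slide inside the region
    offset = rng.randint(0, slack)
    if rng.random() < 0.5:
        return offset                   # clip lies inside the first region
    return n_samples - region + offset  # clip lies inside the last region


rng = random.Random(0)
start = first_or_last_crop_start(32000 * 60, rng=rng)
```

For a 60-second recording, the returned start always falls within the first 2 seconds or between 53 and 55 seconds, so the 5-second clip stays inside one of the two 7-second regions.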
7. Closing Thoughts
My action goals for this competition were the following three.
・Keep the GPU running
・Submit up to the limit
・Read the Discussions thoroughly
I picked up the first two from a blog somewhere, and they were certainly a great help in building implementation skills. The last one, reading the Discussions thoroughly, was also very important.
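On weekdays I put in about 3 hours after getting home, and on days off almost all of my time, so the frustration is all the greater. Looking at the top finishers' solutions, I felt keenly that there is a wall that sheer effort alone cannot get past.
Even so, this experience has been a great source of growth for me. Next year, I will definitely go for a medal.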