Data Analysis of 'Seme' and 'Uke' Stereotypes Among BL Voice Actors
About this article
Inspired by Voice Actor Statistics, I decided to analyze how voice actors relate to the stereotypes of the characters they play. My personal impression was, for example, that "characters played by Eri Kitamura tend to be relatively flat-chested." I wondered whether the characters a voice actor plays are biased in personality or appearance toward the "signature role" (persona) that actor is known for, and I thought it would be interesting to check this against data.
However, building a table that maps female voice actors to the physical appearance of their characters looked extremely tedious, so I do not attempt it in this article. Instead, following a similar idea, I found that data seemed to be available on whether the characters played by BL (Boys' Love) voice actors were the "Seme" (top) or the "Uke" (bottom), so I analyze that here.
Data Collection
BL is a well-known major genre, so naturally, there are specialized information sites. For this article, I decided to use the information published on a site called "Chil-Chil."
The data comes from a page on this site called "Voice Actor List | BL Information Site Chil-Chil." The page lets you search for voice actors by name or by so-called "coupling," and it lists "Voice Actor Name," "Number of Works," "Seme Works," and "Uke Works" for each voice actor in a table (I have not tried the coupling search, as it seems to require signing up).
For the analysis in this article, I scraped these four items from the page.
if (!file.exists("./chil-chil_anchor_list.csv")) {
  base_url <- "https://www.chil-chil.net/"
  # Introduce the scraper to the host and check robots.txt
  session <- polite::bow(base_url)
  #### As of 2020/09/25, there are up to 21 pages ####
  tables <- purrr::map_dfr(1:21, function(page) {
    session %>%
      # Build "voiceList/page/<page>/sort/1"; a single vector is joined with collapse, not sep
      polite::nod(path = paste(c("voiceList", "page", page, "sort", 1L), collapse = "/")) %>%
      polite::scrape(content = "text/html; charset=UTF-8") %>%
      rvest::html_nodes("#anchor_list") %>%
      rvest::html_nodes(".c-list_artist") %>%
      rvest::html_table() %>%
      # Bind the tables found on this page into one data frame
      purrr::map_dfr(~ .) %>%
      dplyr::mutate(source_page = page)
  })
}
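As a small sketch of the paging step, the snippet below lists the request paths the loop is meant to visit, without touching the network. The `voiceList/page/<n>/sort/1` pattern is taken from the scraping code; `build_path` is a hypothetical helper introduced only for this illustration.

```r
# Sketch only: reproduce the request paths without making any HTTP request.
# `build_path` is a hypothetical helper, not part of the original script.
build_path <- function(page) {
  # collapse joins the elements of ONE vector into a single string
  paste(c("voiceList", "page", page, "sort", 1L), collapse = "/")
}

paths <- vapply(1:21, build_path, character(1))
print(head(paths, 3))

# For comparison: sep only separates SEVERAL arguments, so a single
# vector comes back element by element instead of as one path.
pieces <- paste(c("voiceList", "page", 1L, "sort", 1L), sep = "/")
print(length(pieces))
```

Checking the generated paths this way before scraping is a cheap guard against requesting malformed URLs from the site.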
After dropping the link columns and renaming the remaining ones, we get a data frame in the following format.
if (!file.exists("./chil-chil_anchor_list.csv")) {
  tbl <- tables %>%
    dplyr::select(-c("HP", "Twitter")) %>%
    dplyr::rename(
      name = "声優名",
      counts = "作品数",
      seme = "攻め作品",
      uke = "受け作品"
    ) %>%
    dplyr::mutate(name = stringr::str_remove(name, "([[:alpha:]]+)")) %>%
    readr::write_csv("chil-chil_anchor_list.csv")
} else {
  tbl <- readr::read_csv("chil-chil_anchor_list.csv")
}
tbl %>%
  head() %>%
  knitr::kable()
| name | counts | seme | uke | source_page |
|---|---|---|---|---|
| 森川智之 | 670 | 312 | 75 | 1 |
| 遊佐浩二 | 333 | 104 | 70 | 1 |
| 鳥海浩輔 | 442 | 128 | 95 | 1 |
| 平川大輔 | 336 | 103 | 91 | 1 |
| 羽多野渉 | 301 | 152 | 21 | 1 |
| 鈴木達央 | 171 | 35 | 58 | 1 |
As of September 25, 2020, there were 2,015 voice actors registered.
nrow(tbl)
## [1] 2015
However, most voice actors appear to have little more than their name registered. Even when a voice actor has registered works, those works do not necessarily carry a "Seme works" or "Uke works" entry, which is recorded only for appearances as a major character. As a result, the sum of "Seme works" and "Uke works" does not necessarily match the "Number of Works" listed here.
tbl %>%
  dplyr::select(-name) %>%
  summary()
##      counts            seme              uke           source_page
##  Min.   :  0.00   Min.   :  0.000   Min.   :  0.000   Min.   : 1.00
##  1st Qu.:  1.00   1st Qu.:  0.000   1st Qu.:  0.000   1st Qu.: 6.00
##  Median :  2.00   Median :  0.000   Median :  0.000   Median :11.00
##  Mean   : 11.79   Mean   :  2.195   Mean   :  2.093   Mean   :10.58
##  3rd Qu.:  7.00   3rd Qu.:  0.000   3rd Qu.:  0.000   3rd Qu.:16.00
##  Max.   :670.00   Max.   :312.000   Max.   :200.000   Max.   :21.00
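To make the mismatch concrete, here is a minimal check on the six rows printed in the table above, written in base R so it runs without the tidyverse. The column `unlabelled` is a hypothetical name introduced for this sketch. It assumes no work is counted as both Seme and Uke for the same voice actor, which the source data does not confirm.

```r
# The six rows shown in the kable output above, copied by hand
head_tbl <- data.frame(
  name   = c("森川智之", "遊佐浩二", "鳥海浩輔", "平川大輔", "羽多野渉", "鈴木達央"),
  counts = c(670, 333, 442, 336, 301, 171),
  seme   = c(312, 104, 128, 103, 152, 35),
  uke    = c(75, 70, 95, 91, 21, 58),
  stringsAsFactors = FALSE
)

# Hypothetical column: registered works with neither a Seme nor an Uke entry
# (assumes the two columns never count the same work twice)
head_tbl$unlabelled <- head_tbl$counts - head_tbl$seme - head_tbl$uke
print(head_tbl)
```

For every one of these six voice actors the remainder is positive, so a sizeable share of their registered works carries neither label.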
Out of the 2,015 voice actors, 173 had registrations for both "Seme works" and "Uke works" (i.e., at least 1 in each column). The following analysis focuses only on these voice actors with registrations for both.
tbl %>%
  dplyr::filter(seme > 0, uke > 0) %>%
  nrow()
## [1] 173
Visualization with Graphs
Status of Seme/Uke Registration Counts
Visualizing with "Seme works" and "Uke works" as axes results in the following:
tbl %>%
  dplyr::filter(seme > 0, uke > 0) %>%
  # Split the voice actors into six rank-based bins by number of registered works
  dplyr::mutate(binned_rank = dplyr::ntile(counts, 6L)) %>%
  dplyr::mutate(binned_rank = as.factor(binned_rank)) %>%
  ggplot2::ggplot(ggplot2::aes(x = seme, y = uke, label = name, colour = binned_rank)) +
  ggplot2::geom_point() +
  ggplot2::xlab("Seme") +
  ggplot2::ylab("Uke") +
  ggplot2::ggtitle("Registration Count of Seme/Uke A") +
  ggplot2::theme_light() +
  ggplot2::scale_colour_manual(values = viridisLite::plasma(6L))

The following figure shows the same data with both axes on a square-root scale.
tbl %>%
  dplyr::filter(seme > 0, uke > 0) %>%
  dplyr::mutate(binned_rank = dplyr::ntile(counts, 6L)) %>%
  dplyr::mutate(binned_rank = as.factor(binned_rank)) %>%
  ggplot2::ggplot(ggplot2::aes(x = seme, y = uke, label = name, colour = binned_rank)) +
  ggplot2::geom_point() +
  # Square-root axes spread out the dense cluster near the origin
  ggplot2::scale_x_sqrt() +
  ggplot2::scale_y_sqrt() +
  ggplot2::xlab("Seme") +
  ggplot2::ylab("Uke") +
  ggplot2::ggtitle("Registration Count of Seme/Uke B") +
  ggplot2::theme_light() +
  ggplot2::scale_colour_manual(values = viridisLite::plasma(6L))
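The colour groups in the plots above come from `dplyr::ntile(counts, 6L)`, which ranks the rows and cuts them into six bins of (nearly) equal size. The sketch below shows that behaviour on made-up numbers; `toy_counts` is a hypothetical vector, not data from Chil-Chil, and the dplyr package must be installed.

```r
# Made-up registration counts, for illustration only (12 values, no ties)
toy_counts <- c(1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233)

# ntile() assigns bin 1 to the smallest values and bin 6 to the largest
bins <- dplyr::ntile(toy_counts, 6L)
print(bins)

# Keeping "bin 5 or higher" retains the top third of the ranked values
top_counts <- toy_counts[bins > 4]
print(top_counts)
```

Because the bins are rank-based, a bin says where a voice actor stands relative to the others, not how many works they have in absolute terms.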

Below, only the voice actors in bin 5 or higher by registration count are extracted and plotted with their names.
Toshiyuki Morikawa and Hikaru Midorikawa sit in clearly outlying positions. For reference, here are their "areas of expertise" as quoted from the introduction blurbs on their individual "Chil-Chil" pages.
Toshiyuki Morikawa
Expertise: Super Seme. Ultra Seme.
Hikaru Midorikawa
Expertise: Super Uke. Ultra Uke. Beautiful Uke. Puri-voice Uke. A model for moaning.
Additionally, Morikawa is apparently "called the Emperor of the BL voice acting world."
tbl %>%
  dplyr::filter(seme > 0, uke > 0) %>%
  dplyr::mutate(binned_rank = dplyr::ntile(counts, 6L)) %>%
  # Keep bins 5 and 6 only
  dplyr::filter(binned_rank > 4) %>%
  ggplot2::ggplot(ggplot2::aes(x = seme, y = uke, label = name, alpha = counts)) +
  ggplot2::geom_point() +
  ggrepel::geom_text_repel() +
  ggplot2::scale_x_sqrt() +
  ggplot2::scale_y_sqrt() +
  ggplot2::xlab("Number of Seme") +
  ggplot2::ylab("Number of Uke") +
  ggplot2::ggtitle("Registration Count of Seme/Uke C") +
  # No colour aesthetic is mapped here, so no colour scale is needed
  ggplot2::theme_light()

Status of Seme/Uke Aptitude
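Here I treat the ratio of "Seme works" to "Uke works" as "Seme aptitude" and the inverse ratio as "Uke aptitude." Plotted with log scales on both axes, the points fall onto a neat straight line, which is expected because the two ratios are reciprocals of each other.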
tbl %>%
  dplyr::filter(seme > 0, uke > 0) %>%
  dplyr::mutate(binned_rank = dplyr::ntile(counts, 6L)) %>%
  dplyr::mutate(binned_rank = as.factor(binned_rank)) %>%
  # SU: Seme works per Uke work; US: Uke works per Seme work
  dplyr::mutate(SU = seme / uke) %>%
  dplyr::mutate(US = uke / seme) %>%
  ggplot2::ggplot(ggplot2::aes(x = SU, y = US, label = name, alpha = counts, colour = binned_rank)) +
  ggplot2::geom_point() +
  ggplot2::scale_x_log10() +
  ggplot2::scale_y_log10() +
  ggplot2::ggtitle("Seme/Uke Aptitude A") +
  ggplot2::xlab("Seme Aptitude") +
  ggplot2::ylab("Uke Aptitude") +
  ggplot2::theme_light() +
  ggplot2::scale_colour_manual(values = viridisLite::plasma(6L))
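The straight line in this plot follows directly from how the two axes are defined in the code (`SU = seme / uke`, `US = uke / seme`). Written out:

```latex
\[
\mathrm{SU} = \frac{\text{seme}}{\text{uke}}, \qquad
\mathrm{US} = \frac{\text{uke}}{\text{seme}} = \frac{1}{\mathrm{SU}}
\quad\Longrightarrow\quad
\log_{10}\mathrm{US} = -\log_{10}\mathrm{SU}.
\]
```

On log-log axes every voice actor therefore lies on the line of slope -1 through the point (1, 1). The position along that line carries a single quantity, the Seme-to-Uke ratio, not two independent measurements.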

The figure below shows only the voice actors in bin 6, the top group by registration count, along with their names. Akira Ishida and Yuki Miyata appear to be strongly suited to Uke roles, while Katsuyuki Konishi and Junichi Suwabe appear to be strongly suited to Seme roles.
tbl %>%
  dplyr::filter(seme > 0, uke > 0) %>%
  dplyr::mutate(binned_rank = dplyr::ntile(counts, 6L)) %>%
  # Keep only the top bin
  dplyr::filter(binned_rank == 6) %>%
  dplyr::mutate(SU = seme / uke) %>%
  dplyr::mutate(US = uke / seme) %>%
  ggplot2::ggplot(ggplot2::aes(x = SU, y = US, label = name, alpha = counts)) +
  ggplot2::geom_point() +
  ggrepel::geom_text_repel() +
  ggplot2::scale_x_sqrt() +
  ggplot2::scale_y_sqrt() +
  ggplot2::ggtitle("Seme/Uke Aptitude B") +
  ggplot2::xlab("Seme Aptitude") +
  ggplot2::ylab("Uke Aptitude") +
  # No colour aesthetic is mapped here, so no colour scale is needed
  ggplot2::theme_light()

Summary
Findings
The data suggests that individual voice actors do show certain tendencies: Akira Ishida, for example, has a high Uke aptitude, and Katsuyuki Konishi has a high Seme aptitude. As for Toshiyuki Morikawa, known as the "Emperor of the BL voice acting world," he has a reputation for specializing in Seme roles, yet his Seme aptitude is not remarkably high compared to others, likely due in part to the exceptionally high number of his registered works overall.
However, since I am not at all familiar with BL, I cannot judge whether these results match the intuitions of people who are. Furthermore, the data used here was obtained from the "Chil-Chil" website and likely does not cover every BL work in circulation. It is therefore unclear whether the data accurately reflects the tendencies of the characters played by BL voice actors.
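The observation about Morikawa can be partly checked against the six rows printed in the data collection section. The sketch below recomputes the Seme-to-Uke ratio for those rows only, in base R. It is not the full set of 173 voice actors, so it illustrates the point without confirming it.

```r
# Seme and Uke counts for the six rows shown in the table earlier in the article
head_tbl <- data.frame(
  name = c("森川智之", "遊佐浩二", "鳥海浩輔", "平川大輔", "羽多野渉", "鈴木達央"),
  seme = c(312, 104, 128, 103, 152, 35),
  uke  = c(75, 70, 95, 91, 21, 58),
  stringsAsFactors = FALSE
)

# Seme aptitude as defined in the article: Seme works per Uke work
head_tbl$SU <- head_tbl$seme / head_tbl$uke

# Sort from the most Seme-leaning to the most Uke-leaning
ranked <- head_tbl[order(-head_tbl$SU), ]
print(ranked)
```

Within these six rows, 羽多野渉 (Wataru Hatano) has the highest ratio even though his Seme count is less than half of Morikawa's, because the ratio depends on the balance of roles and not on the volume of works.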
Other comments
Future Prospects
I hope to see analyses that explore the relationship between voice quality and these trends. Also, speaking as a man myself, most examples of voice actor analysis, including "Voice Actor Statistics," focus on female voice actors, and there seem to be almost no examples that cover male voice actors. I feel there is a significant gender bias in this field of research, so I hope fans of male voice actors will step up at some point.
Session Information
sessioninfo::session_info()
## - Session info ---------------------------------------------------------------
## setting value
## version R version 4.0.2 (2020-06-22)
## os Windows 10 x64
## system x86_64, mingw32
## ui RStudio
## language (EN)
## collate Japanese_Japan.932
## ctype Japanese_Japan.932
## tz Asia/Tokyo
## date 2020-09-25
##
## - Packages -------------------------------------------------------------------
## package * version date lib source
## assertthat 0.2.1 2019-03-21 [1] CRAN (R 4.0.2)
## backports 1.1.10 2020-09-15 [1] CRAN (R 4.0.2)
## blob 1.2.1 2020-01-20 [1] CRAN (R 4.0.2)
## broom 0.7.0 2020-07-09 [1] CRAN (R 4.0.2)
## cellranger 1.1.0 2016-07-27 [1] CRAN (R 4.0.2)
## cli 2.0.2 2020-02-28 [1] CRAN (R 4.0.2)
## colorspace 1.4-1 2019-03-18 [1] CRAN (R 4.0.2)
## crayon 1.3.4 2017-09-16 [1] CRAN (R 4.0.2)
## DBI 1.1.0 2019-12-15 [1] CRAN (R 4.0.2)
## dbplyr 1.4.4 2020-05-27 [1] CRAN (R 4.0.2)
## digest 0.6.25 2020-02-23 [1] CRAN (R 4.0.2)
## dplyr * 1.0.2 2020-08-18 [1] CRAN (R 4.0.2)
## ellipsis 0.3.1 2020-05-15 [1] CRAN (R 4.0.2)
## evaluate 0.14 2019-05-28 [1] CRAN (R 4.0.2)
## fansi 0.4.1 2020-01-08 [1] CRAN (R 4.0.2)
## farver 2.0.3 2020-01-16 [1] CRAN (R 4.0.2)
## forcats * 0.5.0 2020-03-01 [1] CRAN (R 4.0.2)
## fs 1.5.0 2020-07-31 [1] CRAN (R 4.0.2)
## generics 0.0.2 2018-11-29 [1] CRAN (R 4.0.2)
## ggplot2 * 3.3.2 2020-06-19 [1] CRAN (R 4.0.2)
## ggrepel 0.8.2 2020-03-08 [1] CRAN (R 4.0.2)
## glue 1.4.2 2020-08-27 [1] CRAN (R 4.0.2)
## gtable 0.3.0 2019-03-25 [1] CRAN (R 4.0.2)
## haven 2.3.1 2020-06-01 [1] CRAN (R 4.0.2)
## here 0.1 2017-05-28 [1] CRAN (R 4.0.2)
## highr 0.8 2019-03-20 [1] CRAN (R 4.0.2)
## hms 0.5.3 2020-01-08 [1] CRAN (R 4.0.2)
## htmltools 0.5.0 2020-06-16 [1] CRAN (R 4.0.2)
## httr 1.4.2 2020-07-20 [1] CRAN (R 4.0.2)
## jsonlite 1.7.1 2020-09-07 [1] CRAN (R 4.0.2)
## knitr 1.30 2020-09-22 [1] CRAN (R 4.0.2)
## labeling 0.3 2014-08-23 [1] CRAN (R 4.0.0)
## lifecycle 0.2.0 2020-03-06 [1] CRAN (R 4.0.2)
## lubridate 1.7.9 2020-06-08 [1] CRAN (R 4.0.2)
## magrittr 1.5 2014-11-22 [1] CRAN (R 4.0.2)
## memoise 1.1.0 2017-04-21 [1] CRAN (R 4.0.2)
## modelr 0.1.8 2020-05-19 [1] CRAN (R 4.0.2)
## munsell 0.5.0 2018-06-12 [1] CRAN (R 4.0.2)
## pillar 1.4.6 2020-07-10 [1] CRAN (R 4.0.2)
## pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.0.2)
## polite 0.1.1 2019-11-30 [1] CRAN (R 4.0.2)
## purrr * 0.3.4 2020-04-17 [1] CRAN (R 4.0.2)
## R6 2.4.1 2019-11-12 [1] CRAN (R 4.0.2)
## ratelimitr 0.4.1 2018-10-07 [1] CRAN (R 4.0.2)
## Rcpp 1.0.5 2020-07-06 [1] CRAN (R 4.0.2)
## readr * 1.3.1 2018-12-21 [1] CRAN (R 4.0.2)
## readxl 1.3.1 2019-03-13 [1] CRAN (R 4.0.2)
## reprex 0.3.0 2019-05-16 [1] CRAN (R 4.0.2)
## rlang 0.4.7 2020-07-09 [1] CRAN (R 4.0.2)
## rmarkdown 2.3 2020-06-18 [1] CRAN (R 4.0.2)
## robotstxt 0.7.13 2020-09-03 [1] CRAN (R 4.0.2)
## rprojroot 1.3-2 2018-01-03 [1] CRAN (R 4.0.2)
## rstudioapi 0.11 2020-02-07 [1] CRAN (R 4.0.2)
## rvest 0.3.6 2020-07-25 [1] CRAN (R 4.0.2)
## scales 1.1.1 2020-05-11 [1] CRAN (R 4.0.2)
## sessioninfo 1.1.1 2018-11-05 [1] CRAN (R 4.0.2)
## stringi 1.5.3 2020-09-09 [1] CRAN (R 4.0.2)
## stringr * 1.4.0 2019-02-10 [1] CRAN (R 4.0.2)
## tibble * 3.0.3 2020-07-10 [1] CRAN (R 4.0.2)
## tidyr * 1.1.2 2020-08-27 [1] CRAN (R 4.0.2)
## tidyselect 1.1.0 2020-05-11 [1] CRAN (R 4.0.2)
## tidyverse * 1.3.0 2019-11-21 [1] CRAN (R 4.0.2)
## usethis 1.6.3 2020-09-17 [1] CRAN (R 4.0.2)
## vctrs 0.3.4 2020-08-29 [1] CRAN (R 4.0.2)
## viridisLite 0.3.0 2018-02-01 [1] CRAN (R 4.0.2)
## withr 2.3.0 2020-09-22 [1] CRAN (R 4.0.2)
## xfun 0.17 2020-09-09 [1] CRAN (R 4.0.2)
## xml2 1.3.2 2020-04-23 [1] CRAN (R 4.0.2)
## yaml 2.2.1 2020-02-01 [1] CRAN (R 4.0.0)
##
## [1] C:/Users/user/Documents/R/win-library/4.0
## [2] C:/Program Files/R/R-4.0.2/library
Source Code for This Article
Written in R Markdown.
Source code for "Data Analysis of Seme and Uke Stereotypes of BL Voice Actors"