Nov 26, 2024

create journal impact factor table for target journal selection

setup

i came across a spreadsheet of the 2023 version of Journal Impact Factor (JIF). a quick look in Excel showed that the JIFs decreased from 500+ to a long stretch of 1.6s that seemed to go on forever. so i'll need to check whether this spreadsheet is complete and correct, and if it is, i'll make a suitable copy of it for picking candidate journals for research articles.

solution

the plan

load the JIF spreadsheet in R.
check whether there's anything obviously wrong with the JIFs and the number of journals.
if the source table seems trustworthy enough, then a. pick journals by field. any field w/ a shred of possibility of publishing something biomedical or clinical is included. b. sort journals by descending JIF.
write the resulting simpler JIF table to a new sheet in the source doc.
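the plan above can be sketched on a toy table before touching the real spreadsheet. this is only a minimal sketch: the journal names, JIF values and categories below are made up, and the real sheet's column layout may differ.

```r
library(dplyr)
library(stringr)

# toy stand-in for the JIF sheet; all names and values are made up
toy <- tibble(
  Name = c('journal a', 'journal b', 'journal c', 'journal d'),
  JIF = c('12.3', '<0.1', 'N/A', '4.5'),
  Category = c('ONCOLOGY|CELL BIOLOGY', 'ONCOLOGY', 'GEOLOGY', 'SURGERY')
)

toy_sorted <- toy |>
  mutate(
    # '<0.1' and 'N/A' can't be parsed, so they become NA (warning silenced)
    JIF = suppressWarnings(as.numeric(JIF)),
    # drop everything from the first '|' onwards to get the first-level field
    Category_level_1 = str_remove(Category, '\\|.*$')
  ) |>
  # hypothetical list of fields to keep
  filter(Category_level_1 %in% c('ONCOLOGY', 'SURGERY')) |>
  # descending JIF; arrange() puts NA rows last
  arrange(desc(JIF))

print(toy_sorted)
```

journals whose JIF couldn't be parsed stay in the table but sink to the bottom, which is probably fine for a shortlist.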

programming language & module(s):

R
tidyverse
readxl
openxlsx

input docs

can be found here: https://discuss.sci-hub.org.cn/d/2605

variables to customize:

folder and jif_name, for determining JIF spreadsheet doc path
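since both variables get edited by hand, it may be worth failing early when the resulting path doesn't exist. a minimal sketch, assuming a helper called build_jif_path (hypothetical, not part of the script) and a throwaway file in the session's temp folder for the demo:

```r
# hypothetical helper: build the path and stop early if the file is missing
build_jif_path <- function(folder, jif_name) {
  jif_path <- file.path(folder, jif_name)
  if (!file.exists(jif_path)) {
    stop('JIF spreadsheet not found: ', jif_path)
  }
  jif_path
}

# demo with an empty placeholder file (not a real spreadsheet)
demo_folder <- tempdir()
demo_name <- 'JIF.demo.xlsx'
file.create(file.path(demo_folder, demo_name))

demo_path <- build_jif_path(demo_folder, demo_name)
print(basename(demo_path))
```

with the real values, the call would be build_jif_path(folder, jif_name).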

the script:

sort_out_JIFs.R

rm(list = ls())
for (pkg in c('tidyverse', 'readxl', 'openxlsx')){
  if(!requireNamespace(pkg, quietly = T)){
    install.packages(pkg, dependencies = T)
  }
  library(pkg, character.only = T)
}

folder <- 'c:/users/xiao/desktop'
jif_name <- 'JIF.2024.xlsx'
jif_path <- file.path(folder, jif_name)
df_input <- read_excel(jif_path)
# the native pipe |> needs a function call on its right-hand side, hence the ()
df_input |> glimpse()
# 21800+ journals are included in the 2024 JCR according to official website 
# (see https://clarivate.com.cn/2024/06/20/2024jcr/). so the total journal number
# matches.
df_input |> 
  count(JIF) |> 
  arrange(desc(JIF)) |> 
  print(n = nrow(df_input))
# 13 JIF == 'N/A's and 683 '<0.1's.
print(683 + 13)
df_input <- df_input |> 
  select(c(1, 2, 5, 6, 7)) |> 
  mutate(JIF = as.numeric(JIF))  # '<0.1' and 'N/A' turn into NA (with a coercion warning)
df_input |> head()
df_input |> 
  count(JIF) |> 
  arrange(desc(JIF)) |> 
  print(n = nrow(df_input))
# so the 696 NAs resulted from the <0.1 and N/A.
df_input <- df_input |> 
  mutate(
    Category_level_1 = Category |> 
      str_split('\\|') |> 
      sapply(function(x) x[[1]]))

# pick fields to keep in the simplified JIF table.
field_level_1_full <- df_input |> 
  distinct(Category_level_1) |> 
  pull(Category_level_1) |> 
  sort()
field_level_1_full
field_level_1_keep <- field_level_1_full[
  c(
    2,3,4,5,7,8,9,10,11,12,14,16,20:26,29:38,40,51,52,55,56,57,58,59,60,65,66,
    67,83,89,94:97,103,104,106,108,115,116,118,121,138,139,152:157,161,162,
    165,167:170,172:174,176:178,180,181,184:188,198,199,207,219,224,225,226,
    239,243,247,249,250,251,254
  )
]
field_level_1_keep |> head()
df_jif <- df_input |> 
  filter(Category_level_1 %in% field_level_1_keep) |> 
  mutate(
    # make Category_level_1 more readable
    Category_level_1 = str_to_lower(Category_level_1), 
    # convert to numbers in case i'd want to sort by this col in excel
    JIF5Years = as.numeric(JIF5Years)
  ) |> 
  arrange(desc(JIF), Name)
df_jif |> head()
# looking good.

# export to a new sheet in the spreadsheet src doc.
jif_xlsx <- loadWorkbook(jif_path)
out_sheet_name <- 'JIF_processed'
if (out_sheet_name %in% names(jif_xlsx)){
  jif_xlsx |> removeWorksheet(out_sheet_name)
}
jif_xlsx |> addWorksheet(out_sheet_name)
jif_xlsx |> writeData(sheet = out_sheet_name, x = df_jif)
# set xlsx doc font while i'm at it ...
jif_xlsx |> modifyBaseFont(
  fontSize = 11, 
  fontColour = 'black', 
  fontName = 'Times New Roman')
jif_xlsx |> saveWorkbook(jif_path, overwrite = T)
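picking fields by position works, but the positions shift whenever the field list changes, and an out-of-range index silently gives NA instead of an error. as a possible alternative (not what the script above does), fields could be picked by keyword. a minimal sketch with made-up field names and hypothetical keywords:

```r
# made-up field names standing in for field_level_1_full
fields <- c(
  'ANATOMY & MORPHOLOGY', 'GEOLOGY', 'ONCOLOGY',
  'MEDICINE, GENERAL & INTERNAL', 'MUSIC'
)

# hypothetical keywords; adjust to taste
keywords <- c('ANATOMY', 'ONCOLOGY', 'MEDICINE')

# join the keywords into one 'this or that' regex and keep the matching fields
pattern <- paste(keywords, collapse = '|')
keep <- fields[grepl(pattern, fields)]
print(keep)

# for comparison: an index past the end of the vector quietly returns NA
print(fields[c(1, 99)])
```

the keyword list is easier to review than a run of numbers, at the cost of possibly matching fields that only look relevant, so the result still deserves a read-through.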

output

the simplified JIF table can be found in the sheet titled 'JIF_processed' in the source JIF spreadsheet doc.

note to self

the apply function family's super handy. always think of them when feeling an itch for the for loop.
| is the alternation ("or") operator in regex, so it has to be escaped to match a literal pipe: 'a|b' |> str_split('|') returns a weird list of [[1]] [1] "" "a" "|" "b" "" instead of [[1]] [1] "a" "b". for the latter, use str_split('\\|').
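both notes can be tried out in a few lines. a minimal sketch, assuming stringr is installed; fixed() and vapply() are alternatives to what the script uses, not something it relies on:

```r
library(stringr)

x <- 'a|b'

# escape the pipe so the regex engine reads it as a literal character ...
escaped <- str_split(x, '\\|')
# ... or skip regex altogether and match the pattern as a plain string
literal <- str_split(x, fixed('|'))
print(escaped)
print(literal)

# vapply() is a stricter sapply(): it errors unless every result is one string
pieces <- str_split(c('a|b', 'c'), fixed('|'))
firsts <- vapply(pieces, function(p) p[[1]], character(1))
print(firsts)
```

fixed('|') reads a bit more clearly than the double backslash when the separator is always a literal character.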

R publication impact factor

Xiao
