The data was obtained from the WHO (World Health Organization), http://www.who.int. The data was complex: four tables had to be joined to take advantage of the full data set, and several analyses were carried out on tuberculosis cases, HIV-tuberculosis cases and funding of tuberculosis care around the world.
** new - new cases
** sp - smear positive
** sn - smear negative
** ep - extrapulmonary
** f - female
** m - male
** u - unknown
** whoregion -
Notes: Setting up the R environment by loading the needed packages: dplyr, tidyverse, tidyr, tibble, stringr, readr, magrittr, lubridate, ggplot2, forcats and knitr.
library(dplyr)
##
## Attaching package: 'dplyr'
## The following objects are masked from 'package:stats':
##
## filter, lag
## The following objects are masked from 'package:base':
##
## intersect, setdiff, setequal, union
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ stringr 1.4.0
## ✔ tidyr 1.2.0 ✔ forcats 0.5.2
## ✔ readr 2.1.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(tidyr)
library(tibble)
library(stringr)
library(readr)
library(magrittr)
##
## Attaching package: 'magrittr'
##
## The following object is masked from 'package:purrr':
##
## set_names
##
## The following object is masked from 'package:tidyr':
##
## extract
library(lubridate)
##
## Attaching package: 'lubridate'
##
## The following objects are masked from 'package:base':
##
## date, intersect, setdiff, union
library(ggplot2)
library(forcats)
library(knitr)
Notes: The working directory was checked with getwd() and the presence of each file was confirmed with file.exists(). The four files were then loaded into R and later joined into a single table for ease of use.
#To print the working directory file path
getwd()
## [1] "C:/Users/dell/Documents"
# to confirm that the files exist at the expected paths
file.exists("Excel/TB_notifications_2022-08-19.csv")
## [1] TRUE
file.exists("Excel/TB_outcomes_2022-08-20.csv")
## [1] TRUE
file.exists("Excel/TB_burden_countries_2022-08-19.csv")
## [1] TRUE
file.exists("Excel/TB_expenditure_utilisation_2022-08-19.csv")
## [1] TRUE
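The four checks above can also be written as one vectorised call that stops early when a file is missing. This is only a minimal sketch: the file names below are hypothetical stand-ins created in a temporary directory so that the example runs anywhere, and they would be swapped for the real `Excel/...` paths.

```r
# Hypothetical stand-in files, created in a temporary directory so the
# sketch runs anywhere; replace with the real "Excel/..." paths.
data_dir <- tempfile("tb_data_")
dir.create(data_dir)
paths <- file.path(data_dir, c("notifications.csv", "outcomes.csv"))
file.create(paths)

# file.exists() is vectorised: it returns one logical value per path
found <- file.exists(paths)
names(found) <- basename(paths)
print(found)

# Fail fast with a readable message if any file is missing
missing_files <- paths[!found]
if (length(missing_files) > 0) {
  stop("Missing input files: ", paste(missing_files, collapse = ", "))
}
```

Checking all paths before the first `read_csv()` call means a wrong working directory shows up as one clear error rather than part-way through loading.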
#with the help of the readr package, reading a csv file using the read_csv command
tuberculosis <- read_csv("Excel/TB_notifications_2022-08-19.csv")
## Rows: 8707 Columns: 198
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): country, iso2, iso3, iso_numeric, g_whoregion
## dbl (193): year, new_sp, new_sn, new_su, new_ep, new_oth, ret_rel, ret_taf, ...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
tbhiv <- read_csv("Excel/TB_outcomes_2022-08-20.csv")
## Rows: 5539 Columns: 84
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): country, iso2, iso3, iso_numeric, g_whoregion
## dbl (79): year, rep_meth, new_sp_coh, new_sp_cur, new_sp_cmplt, new_sp_died,...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
funding <- read_csv("Excel/TB_expenditure_utilisation_2022-08-19.csv")
## Rows: 860 Columns: 46
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): country, iso2, iso3, iso_numeric, g_whoregion
## dbl (41): year, exp_cpp_dstb, exp_cpp_mdr, exp_cpp_xdr, exp_cpp_tpt, exp_lab...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
estimates <- read_csv("Excel/TB_burden_countries_2022-08-19.csv")
## Rows: 4487 Columns: 50
## ── Column specification ────────────────────────────────────────────────────────
## Delimiter: ","
## chr (5): country, iso2, iso3, iso_numeric, g_whoregion
## dbl (45): year, e_pop_num, e_inc_100k, e_inc_100k_lo, e_inc_100k_hi, e_inc_n...
##
## ℹ Use `spec()` to retrieve the full column specification for this data.
## ℹ Specify the column types or set `show_col_types = FALSE` to quiet this message.
# combining the four datasets
tuberculosis_join <- tuberculosis %>%
full_join(tbhiv, by = c("country", "year")) %>%
full_join(estimates, by = c('country', "year")) %>%
full_join(funding, by = c('country', "year"))
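All four files share the identifier columns iso2, iso3, iso_numeric and g_whoregion, so joining on country and year alone is likely to leave suffixed copies of them (`.x`, `.y`) in the result. The sketch below uses two made-up toy tables, not the WHO data, to show the effect and one possible way around it.

```r
library(dplyr)

# Toy stand-ins for two of the WHO tables (values are made up)
notifications <- data.frame(
  country = c("A", "A", "B"),
  g_whoregion = c("AFR", "AFR", "EUR"),
  year = c(2018, 2019, 2019),
  c_newinc = c(10, 12, 5)
)
burden <- data.frame(
  country = c("A", "B", "B"),
  g_whoregion = c("AFR", "EUR", "EUR"),
  year = c(2019, 2018, 2019),
  e_mort_num = c(2, 1, 1)
)

# Joining on country and year only: the shared non-key column is duplicated
by_two_keys <- full_join(notifications, burden, by = c("country", "year"))
print(names(by_two_keys))

# Adding the shared identifier to the keys keeps a single copy of it
by_all_keys <- full_join(notifications, burden,
                         by = c("country", "g_whoregion", "year"))
print(names(by_all_keys))
```

Whether to widen the join keys or to drop the duplicate columns afterwards is a design choice; the columns selected later in this analysis are not affected either way.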
** Getting an overview of the data
# printing the structure of the data to get an overview of what the data looks like
str(tuberculosis)
## spec_tbl_df [8,707 × 198] (S3: spec_tbl_df/tbl_df/tbl/data.frame)
## $ country : chr [1:8707] "Afghanistan" "Afghanistan" "Afghanistan" "Afghanistan" ...
## $ iso2 : chr [1:8707] "AF" "AF" "AF" "AF" ...
## $ iso3 : chr [1:8707] "AFG" "AFG" "AFG" "AFG" ...
## $ iso_numeric : chr [1:8707] "004" "004" "004" "004" ...
## $ g_whoregion : chr [1:8707] "EMR" "EMR" "EMR" "EMR" ...
## $ year : num [1:8707] 1980 1981 1982 1983 1984 ...
## $ new_sp : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_su : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_oth : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ ret_rel : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ ret_taf : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ ret_tad : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ ret_oth : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ newret_oth : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_labconf : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_clindx : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ ret_rel_labconf : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ ret_rel_clindx : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ ret_rel_ep : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ ret_nrel : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ notif_foreign : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ c_newinc : num [1:8707] 71685 71554 41752 52502 18784 ...
## $ new_sp_m04 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_m514 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_m014 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_m1524 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_m2534 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_m3544 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_m4554 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_m5564 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_m65 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_mu : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_f04 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_f514 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_f014 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_f1524 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_f2534 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_f3544 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_f4554 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_f5564 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_f65 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sp_fu : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_m04 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_m514 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_m014 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_m1524 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_m2534 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_m3544 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_m4554 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_m5564 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_m65 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_m15plus : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_mu : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_f04 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_f514 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_f014 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_f1524 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_f2534 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_f3544 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_f4554 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_f5564 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_f65 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_f15plus : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_fu : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_sexunk04 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_sexunk514 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_sexunk014 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_sn_sexunk15plus : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_m04 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_m514 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_m014 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_m1524 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_m2534 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_m3544 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_m4554 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_m5564 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_m65 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_m15plus : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_mu : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_f04 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_f514 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_f014 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_f1524 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_f2534 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_f3544 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_f4554 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_f5564 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_f65 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_f15plus : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_fu : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_sexunk04 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_sexunk514 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_sexunk014 : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_sexunk15plus : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ new_ep_sexunkageunk : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ rel_in_agesex_flg : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## $ agegroup_option : num [1:8707] NA NA NA NA NA NA NA NA NA NA ...
## [list output truncated]
## - attr(*, "spec")=
## .. cols(
## .. country = col_character(),
## .. iso2 = col_character(),
## .. iso3 = col_character(),
## .. iso_numeric = col_character(),
## .. g_whoregion = col_character(),
## .. year = col_double(),
## .. new_sp = col_double(),
## .. new_sn = col_double(),
## .. new_su = col_double(),
## .. new_ep = col_double(),
## .. new_oth = col_double(),
## .. ret_rel = col_double(),
## .. ret_taf = col_double(),
## .. ret_tad = col_double(),
## .. ret_oth = col_double(),
## .. newret_oth = col_double(),
## .. new_labconf = col_double(),
## .. new_clindx = col_double(),
## .. ret_rel_labconf = col_double(),
## .. ret_rel_clindx = col_double(),
## .. ret_rel_ep = col_double(),
## .. ret_nrel = col_double(),
## .. notif_foreign = col_double(),
## .. c_newinc = col_double(),
## .. new_sp_m04 = col_double(),
## .. new_sp_m514 = col_double(),
## .. new_sp_m014 = col_double(),
## .. new_sp_m1524 = col_double(),
## .. new_sp_m2534 = col_double(),
## .. new_sp_m3544 = col_double(),
## .. new_sp_m4554 = col_double(),
## .. new_sp_m5564 = col_double(),
## .. new_sp_m65 = col_double(),
## .. new_sp_mu = col_double(),
## .. new_sp_f04 = col_double(),
## .. new_sp_f514 = col_double(),
## .. new_sp_f014 = col_double(),
## .. new_sp_f1524 = col_double(),
## .. new_sp_f2534 = col_double(),
## .. new_sp_f3544 = col_double(),
## .. new_sp_f4554 = col_double(),
## .. new_sp_f5564 = col_double(),
## .. new_sp_f65 = col_double(),
## .. new_sp_fu = col_double(),
## .. new_sn_m04 = col_double(),
## .. new_sn_m514 = col_double(),
## .. new_sn_m014 = col_double(),
## .. new_sn_m1524 = col_double(),
## .. new_sn_m2534 = col_double(),
## .. new_sn_m3544 = col_double(),
## .. new_sn_m4554 = col_double(),
## .. new_sn_m5564 = col_double(),
## .. new_sn_m65 = col_double(),
## .. new_sn_m15plus = col_double(),
## .. new_sn_mu = col_double(),
## .. new_sn_f04 = col_double(),
## .. new_sn_f514 = col_double(),
## .. new_sn_f014 = col_double(),
## .. new_sn_f1524 = col_double(),
## .. new_sn_f2534 = col_double(),
## .. new_sn_f3544 = col_double(),
## .. new_sn_f4554 = col_double(),
## .. new_sn_f5564 = col_double(),
## .. new_sn_f65 = col_double(),
## .. new_sn_f15plus = col_double(),
## .. new_sn_fu = col_double(),
## .. new_sn_sexunk04 = col_double(),
## .. new_sn_sexunk514 = col_double(),
## .. new_sn_sexunk014 = col_double(),
## .. new_sn_sexunk15plus = col_double(),
## .. new_ep_m04 = col_double(),
## .. new_ep_m514 = col_double(),
## .. new_ep_m014 = col_double(),
## .. new_ep_m1524 = col_double(),
## .. new_ep_m2534 = col_double(),
## .. new_ep_m3544 = col_double(),
## .. new_ep_m4554 = col_double(),
## .. new_ep_m5564 = col_double(),
## .. new_ep_m65 = col_double(),
## .. new_ep_m15plus = col_double(),
## .. new_ep_mu = col_double(),
## .. new_ep_f04 = col_double(),
## .. new_ep_f514 = col_double(),
## .. new_ep_f014 = col_double(),
## .. new_ep_f1524 = col_double(),
## .. new_ep_f2534 = col_double(),
## .. new_ep_f3544 = col_double(),
## .. new_ep_f4554 = col_double(),
## .. new_ep_f5564 = col_double(),
## .. new_ep_f65 = col_double(),
## .. new_ep_f15plus = col_double(),
## .. new_ep_fu = col_double(),
## .. new_ep_sexunk04 = col_double(),
## .. new_ep_sexunk514 = col_double(),
## .. new_ep_sexunk014 = col_double(),
## .. new_ep_sexunk15plus = col_double(),
## .. new_ep_sexunkageunk = col_double(),
## .. rel_in_agesex_flg = col_double(),
## .. agegroup_option = col_double(),
## .. newrel_m04 = col_double(),
## .. newrel_m59 = col_double(),
## .. newrel_m1014 = col_double(),
## .. newrel_m514 = col_double(),
## .. newrel_m014 = col_double(),
## .. newrel_m1519 = col_double(),
## .. newrel_m2024 = col_double(),
## .. newrel_m1524 = col_double(),
## .. newrel_m2534 = col_double(),
## .. newrel_m3544 = col_double(),
## .. newrel_m4554 = col_double(),
## .. newrel_m5564 = col_double(),
## .. newrel_m65 = col_double(),
## .. newrel_m15plus = col_double(),
## .. newrel_mu = col_double(),
## .. newrel_f04 = col_double(),
## .. newrel_f59 = col_double(),
## .. newrel_f1014 = col_double(),
## .. newrel_f514 = col_double(),
## .. newrel_f014 = col_double(),
## .. newrel_f1519 = col_double(),
## .. newrel_f2024 = col_double(),
## .. newrel_f1524 = col_double(),
## .. newrel_f2534 = col_double(),
## .. newrel_f3544 = col_double(),
## .. newrel_f4554 = col_double(),
## .. newrel_f5564 = col_double(),
## .. newrel_f65 = col_double(),
## .. newrel_f15plus = col_double(),
## .. newrel_fu = col_double(),
## .. newrel_sexunk04 = col_double(),
## .. newrel_sexunk514 = col_double(),
## .. newrel_sexunk014 = col_double(),
## .. newrel_sexunk15plus = col_double(),
## .. newrel_sexunkageunk = col_double(),
## .. rdx_data_available = col_double(),
## .. newinc_rdx = col_double(),
## .. rdxsurvey_newinc = col_double(),
## .. rdxsurvey_newinc_rdx = col_double(),
## .. rdst_new = col_double(),
## .. rdst_ret = col_double(),
## .. rdst_unk = col_double(),
## .. conf_rrmdr = col_double(),
## .. conf_mdr = col_double(),
## .. rr_sldst = col_double(),
## .. all_conf_xdr = col_double(),
## .. conf_rr_nfqr = col_double(),
## .. conf_rr_fqr = col_double(),
## .. unconf_rrmdr_tx = col_double(),
## .. conf_rrmdr_tx = col_double(),
## .. rrmdr_014_tx = col_double(),
## .. unconf_mdr_tx = col_double(),
## .. conf_mdr_tx = col_double(),
## .. conf_xdr_tx = col_double(),
## .. unconf_rr_nfqr_tx = col_double(),
## .. conf_rr_nfqr_tx = col_double(),
## .. conf_rr_fqr_tx = col_double(),
## .. mdrxdr_bdq_used = col_double(),
## .. mdrxdr_bdq_tx = col_double(),
## .. mdrxdr_alloral_used = col_double(),
## .. mdrxdr_alloral_tx = col_double(),
## .. mdrxdr_dlm_used = col_double(),
## .. mdrxdr_dlm_tx = col_double(),
## .. mdr_shortreg_used = col_double(),
## .. mdr_shortreg_tx = col_double(),
## .. mdr_tx_adverse_events = col_double(),
## .. mdr_alloral_short_used = col_double(),
## .. mdr_alloral_short_tx = col_double(),
## .. mdr_tx_adsm = col_double(),
## .. newrel_tbhiv_flg = col_double(),
## .. newrel_hivtest = col_double(),
## .. newrel_hivpos = col_double(),
## .. newrel_art = col_double(),
## .. tbhiv_014_flg = col_double(),
## .. newrel_hivtest_014 = col_double(),
## .. newrel_hivpos_014 = col_double(),
## .. newrel_art_014 = col_double(),
## .. hivtest = col_double(),
## .. hivtest_pos = col_double(),
## .. hiv_cpt = col_double(),
## .. hiv_art = col_double(),
## .. hiv_tbscr = col_double(),
## .. hiv_reg = col_double(),
## .. hiv_ipt = col_double(),
## .. hiv_reg_new = col_double(),
## .. hiv_ipt_reg_all = col_double(),
## .. hiv_reg_all = col_double(),
## .. hiv_tbdetect = col_double(),
## .. hiv_reg_new2 = col_double(),
## .. hiv_elig_all_tpt = col_double(),
## .. hiv_elig_all = col_double(),
## .. hiv_elig_new_tpt = col_double(),
## .. hiv_elig_new = col_double(),
## .. hiv_all_tpt = col_double(),
## .. hiv_all = col_double(),
## .. hiv_new_tpt = col_double(),
## .. hiv_new = col_double(),
## .. hiv_all_tpt_completed = col_double(),
## .. hiv_all_tpt_started = col_double()
## .. )
## - attr(*, "problems")=<externalptr>
tuber <- tuberculosis %>%
gather(key = type, value = cases, new_sp_m04:newrel_f65,-agegroup_option,
-rel_in_agesex_flg, na.rm = TRUE) %>%
select(country, g_whoregion, year, type, cases) %>%
mutate(
type = stringr::str_replace(type, "newrel", "new_rel")
) %>%
separate(type, c("new", "var", "sexage")) %>%
mutate(sexage = stringr::str_replace(sexage, "unk", "u")) %>%
mutate(sexage = stringr::str_replace(sexage, "sexu", "u")) %>%
separate(sexage, c("sex", "age"), sep = 1) %>%
#filtering for reported cases only, i.e. counts greater than zero, from 2000 to 2020
filter(cases > 0, year >= 2000) %>%
# the 15plus values are removed because they combine all ages from 15 upward, so keeping them would count the same cases twice and inflate the totals
filter(age != "15plus")
tuber %>% head(10)
## # A tibble: 10 × 8
## country g_whoregion year new var sex age cases
## <chr> <chr> <dbl> <chr> <chr> <chr> <chr> <dbl>
## 1 Afghanistan EMR 2010 new sp m 04 4
## 2 Afghanistan EMR 2011 new sp m 04 2
## 3 Albania EUR 2006 new sp m 04 1
## 4 Albania EUR 2008 new sp m 04 1
## 5 Angola AFR 2011 new sp m 04 108
## 6 Angola AFR 2012 new sp m 04 58
## 7 Argentina AMR 2006 new sp m 04 19
## 8 Argentina AMR 2007 new sp m 04 14
## 9 Argentina AMR 2008 new sp m 04 11
## 10 Argentina AMR 2009 new sp m 04 8
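The reshaping step above uses `gather()`, which tidyr has superseded with `pivot_longer()`. As a rough sketch of the same idea on a made-up three-column table (the column names follow the WHO naming scheme, the values are invented), the split into new / var / sex / age could look like this:

```r
library(dplyr)
library(tidyr)
library(stringr)

# Toy wide table using the WHO column naming scheme (values are made up)
wide <- data.frame(
  country = c("A", "B"),
  year = c(2010, 2015),
  new_sp_m04 = c(4, NA),
  new_sp_f1524 = c(7, 3),
  newrel_m65 = c(NA, 9)
)

long <- wide %>%
  pivot_longer(
    cols = new_sp_m04:newrel_m65,
    names_to = "type",
    values_to = "cases",
    values_drop_na = TRUE   # same role as na.rm = TRUE in gather()
  ) %>%
  # make "newrel" split the same way as "new_sp", "new_sn" and "new_ep"
  mutate(type = str_replace(type, "newrel", "new_rel")) %>%
  separate(type, c("new", "var", "sexage"), sep = "_") %>%
  # the first character is the sex code, the rest is the age band
  separate(sexage, c("sex", "age"), sep = 1)

print(long)
```

Each wide column becomes one row per country and year, and the column name is split into the fields that the later grouping steps rely on.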
tuber$age[tuber$age == "04"] <- "0-19"
tuber$age[tuber$age == "514"] <- "0-19"
tuber$age[tuber$age == "014"] <- "0-19"
tuber$age[tuber$age == "59"] <- "0-19"
tuber$age[tuber$age == "1014"] <- "0-19"
tuber$age[tuber$age == "1519"] <- "0-19"
tuber$age[tuber$age == "1524"] <- "20-34"
tuber$age[tuber$age == "2024"] <- "20-34"
tuber$age[tuber$age == "2534"] <- "20-34"
tuber$age[tuber$age == "3544"] <- "35-44"
tuber$age[tuber$age == "4554"] <- "45-64"
tuber$age[tuber$age == "5564"] <- "45-64"
tuber$age[tuber$age == "65"] <- "65+"
#replacing the g_whoregion abbreviations with full names for readability
tuber$g_whoregion[tuber$g_whoregion == "AFR"] <- "Africa"
tuber$g_whoregion[tuber$g_whoregion == "AMR"] <- "America"
tuber$g_whoregion[tuber$g_whoregion == "EMR"] <- "Eastern Mediterranean"
tuber$g_whoregion[tuber$g_whoregion == "EUR"] <- "Europe"
tuber$g_whoregion[tuber$g_whoregion == "SEA"] <- "South-East Asia"
tuber$g_whoregion[tuber$g_whoregion == "WPR"] <- "Western Pacific"
kable(head(tuber))
country | g_whoregion | year | new | var | sex | age | cases |
---|---|---|---|---|---|---|---|
Afghanistan | Eastern Mediterranean | 2010 | new | sp | m | 0-19 | 4 |
Afghanistan | Eastern Mediterranean | 2011 | new | sp | m | 0-19 | 2 |
Albania | Europe | 2006 | new | sp | m | 0-19 | 1 |
Albania | Europe | 2008 | new | sp | m | 0-19 | 1 |
Angola | Africa | 2011 | new | sp | m | 0-19 | 108 |
Angola | Africa | 2012 | new | sp | m | 0-19 | 58 |
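The age recoding above repeats one assignment per code. A named lookup vector is one possible alternative that keeps the whole mapping in a single place. In the sketch below the age bands are copied from the assignments above, while `recode_with()` is a hypothetical helper written for this example, not a function from any package.

```r
# Lookup vector: names are the raw codes, values are the display labels.
# The age bands copy the groupings used in the assignments above.
age_lookup <- c(
  "04" = "0-19", "514" = "0-19", "014" = "0-19", "59" = "0-19",
  "1014" = "0-19", "1519" = "0-19",
  "1524" = "20-34", "2024" = "20-34", "2534" = "20-34",
  "3544" = "35-44",
  "4554" = "45-64", "5564" = "45-64",
  "65" = "65+"
)

# Hypothetical helper: recode known codes, leave unknown ones (e.g. "u") as-is
recode_with <- function(x, lookup) {
  hit <- x %in% names(lookup)
  x[hit] <- unname(lookup[x[hit]])
  x
}

raw_age <- c("04", "2534", "65", "u")
print(recode_with(raw_age, age_lookup))
```

The same helper could be reused for the region abbreviations by passing a second lookup vector.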
Note: The data was briefly summarized according to whoregion to show common trends of year, age and sex across regions
# showing the trend of tuberculosis cases from 2000 - 2020 for each WHO region
ggplot(data = tuber, mapping = aes(x = year, y = cases)) +
geom_col() +
facet_wrap(~g_whoregion)
# showing the distribution of tuberculosis cases according to age categories for each WHO region
ggplot(data = tuber, mapping = aes(x = age, y = cases)) +
geom_col() +
facet_wrap(~g_whoregion)
# showing the distribution of tuberculosis cases between male and female
ggplot(data = tuber, mapping = aes(x = sex, y = cases)) +
geom_col() +
facet_wrap(~g_whoregion)
tuber %>%
group_by(., g_whoregion) %>%
summarise(., cases = sum(cases)) %>%
arrange(., desc(cases)) %>% head()
## # A tibble: 6 × 2
## g_whoregion cases
## <chr> <dbl>
## 1 South-East Asia 35196995
## 2 Western Pacific 20327430
## 3 Africa 18562209
## 4 Eastern Mediterranean 6061551
## 5 Europe 4418793
## 6 America 4004211
Which year has the highest tuberculosis cases?
by_year <- tuber %>%
group_by(.,year) %>%
summarise(., total_cases = sum(cases))
kable(by_year)
year | total_cases |
---|---|
2000 | 1148524 |
2001 | 1237444 |
2002 | 1523369 |
2003 | 1860645 |
2004 | 2184958 |
2005 | 2364897 |
2006 | 3064892 |
2007 | 3933707 |
2008 | 3680748 |
2009 | 3932522 |
2010 | 4295276 |
2011 | 4390442 |
2012 | 4377419 |
2013 | 3422746 |
2014 | 5674335 |
2015 | 6041262 |
2016 | 6524581 |
2017 | 6536013 |
2018 | 7202478 |
2019 | 8237527 |
2020 | 6937404 |
plotting the trend of tuberculosis cases by year
by_year %>%
ggplot(aes (x = year)) +
geom_line(aes(y = total_cases))
The year with the highest tuberculosis cases
#What year has the highest number of tuberculosis cases?
by_year %>%
arrange(desc(total_cases)) %>%
head(1)
## # A tibble: 1 × 2
## year total_cases
## <dbl> <dbl>
## 1 2019 8237527
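The arrange-then-head pattern above can also be expressed with dplyr's `slice_max()`. A small sketch, using only the last three rows of the by_year table shown earlier so that it is self-contained:

```r
library(dplyr)

# Last three rows of the by_year table shown above
by_year_tail <- data.frame(
  year = c(2018, 2019, 2020),
  total_cases = c(7202478, 8237527, 6937404)
)

# slice_max() keeps the n rows with the largest value of the chosen column
peak <- by_year_tail %>% slice_max(total_cases, n = 1)
print(peak)
```

`slice_min()` works the same way for the smallest values.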
Which 10 countries have the most tuberculosis cases? Which 10 countries have the fewest?
# grouping cases by country
by_country <- tuber %>%
group_by(.,country) %>%
summarise(., total_cases = sum(cases))
# selecting the 10 countries with the highest tuberculosis cases across all
# the years
top_10 <- by_country %>%
arrange(desc(total_cases)) %>%
head(10)
kable(top_10)
country | total_cases |
---|---|
India | 21337820 |
China | 13048274 |
Indonesia | 6311768 |
South Africa | 5179958 |
Pakistan | 3717044 |
Philippines | 3567369 |
Bangladesh | 3137372 |
Democratic Republic of the Congo | 1782554 |
Myanmar | 1611582 |
Russian Federation | 1526742 |
# a plot of the 10 countries with the highest tuberculosis cases
ggplot(top_10, aes(total_cases, fct_reorder(country, total_cases))) +
geom_point()
#### Which 10 countries have the fewest tuberculosis cases?
# selecting the 10 countries with the fewest tuberculosis cases across all
# the years
least_10 <- by_country %>%
arrange(total_cases) %>%
head(10)
kable(least_10)
country | total_cases |
---|---|
San Marino | 1 |
Tokelau | 2 |
Anguilla | 3 |
Montserrat | 4 |
Monaco | 5 |
Niue | 5 |
British Virgin Islands | 6 |
Cook Islands | 19 |
Bermuda | 20 |
Curaçao | 28 |
#a plot of the 10 countries with the lowest tuberculosis cases
ggplot(least_10, aes(total_cases, fct_reorder(country, total_cases))) +
geom_point()
Checking for patterns across age groups and noting which age groups have the highest number of tuberculosis cases
# grouping data by age_group
by_age_group <- tuber %>%
group_by(.,age) %>%
summarise(., total_cases = sum(cases))
kable(by_age_group)
age | total_cases |
---|---|
0-19 | 9931668 |
20-34 | 31620389 |
35-44 | 15195618 |
45-64 | 22407587 |
65+ | 8896305 |
u | 519622 |
# a plot of total cases for each age group
ggplot(by_age_group, aes(x = age, y = total_cases)) +
geom_col()
The table shows that adults aged 20-64 account for most tuberculosis cases, with the 20-34 group the highest.
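To put numbers on that pattern, the totals can be turned into percentage shares. The sketch below copies the totals from the by_age_group table above into a stand-alone data frame (named `age_totals` here so it does not overwrite anything in the analysis).

```r
# Totals copied from the by_age_group table above
age_totals <- data.frame(
  age = c("0-19", "20-34", "35-44", "45-64", "65+", "u"),
  total_cases = c(9931668, 31620389, 15195618, 22407587, 8896305, 519622)
)

# Share of all notified cases falling in each age group, in percent
age_totals$share <- 100 * age_totals$total_cases / sum(age_totals$total_cases)
print(age_totals)

# Combined share of the three bands spanning ages 20 to 64
working_age <- c("20-34", "35-44", "45-64")
combined_share <- sum(age_totals$share[age_totals$age %in% working_age])
print(combined_share)
```

On these totals the 20-34 band alone holds a little over a third of the cases, and the three bands from 20 to 64 together hold roughly 78%.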
by_sex <- tuber %>%
group_by(.,sex) %>%
summarise(., total_cases = sum(cases)) %>%
arrange(., desc(total_cases))
kable(by_sex)
sex | total_cases |
---|---|
m | 54847196 |
f | 33455750 |
u | 268243 |
# a plot of total cases for each sex
ggplot(by_sex, aes(x = sex, y = total_cases)) +
geom_col()
tbhiv1 <- select(tuberculosis_join, country, year,
success_tbhiv_treatment = tbhiv_succ, failed_tbhiv_treatment = tbhiv_fail,
tbhiv_death = tbhiv_died,
tbhiv_lost) %>%
drop_na() %>%
mutate(tbhiv_total = success_tbhiv_treatment + failed_tbhiv_treatment + tbhiv_death + tbhiv_lost)
tbhiv1 %>%
group_by(.,country, year) %>%
summarise(., total_tbhiv_cases = sum(tbhiv_total)) %>%
filter(., total_tbhiv_cases > 0, year == 2019)%>%
arrange(desc(total_tbhiv_cases)) %>%
head(10)
## `summarise()` has grouped output by 'country'. You can override using the
## `.groups` argument.
## # A tibble: 10 × 3
## # Groups: country [10]
## country year total_tbhiv_cases
## <chr> <dbl> <dbl>
## 1 South Africa 2019 95969
## 2 India 2019 35419
## 3 Mozambique 2019 31753
## 4 Kenya 2019 20980
## 5 United Republic of Tanzania 2019 18792
## 6 Zambia 2019 16002
## 7 Uganda 2019 15709
## 8 Zimbabwe 2019 12257
## 9 Nigeria 2019 12214
## 10 Russian Federation 2019 10635
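The message printed above appears because `summarise()` drops only the last grouping level and leaves the result grouped by country. Passing `.groups = "drop"` is one way to get an ungrouped result without the message. A minimal sketch on made-up toy data:

```r
library(dplyr)

# Toy data (values are made up)
toy <- data.frame(
  country = c("A", "A", "B"),
  year = c(2019, 2019, 2019),
  tbhiv_total = c(5, 7, 3)
)

totals <- toy %>%
  group_by(country, year) %>%
  # .groups = "drop" returns an ungrouped result and silences the message
  summarise(total_tbhiv_cases = sum(tbhiv_total), .groups = "drop")

print(totals)
```

An ungrouped result also means that a later `head(10)` or `arrange()` acts on the whole table rather than within each country.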
Note: 8 of the 10 countries with the most TBHIV cases in 2019 were African countries, and many of those are in the southern part of the continent
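# by_year_country is not defined earlier in this document; it is assumed here to
# hold total cases per country and year, built the same way as by_country
by_year_country <- tuber %>%
group_by(country, year) %>%
summarise(total_cases = sum(cases), .groups = "drop")
# viewing, for 2019, the countries with the highest TBHIV totals as a percentage of notified tuberculosis cases
tbhiv1 %>%
full_join(by_year_country, by = c("country", "year")) %>%
select(., country, year, total_cases, tbhiv_total) %>%
drop_na()%>%
mutate(., hivtb_rate = (tbhiv_total/total_cases)*100) %>%
arrange(desc(hivtb_rate)) %>%
filter(., hivtb_rate < 100, total_cases > 100000, year == 2019)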
## # A tibble: 12 × 5
## country year total_cases tbhiv_total hivtb_rate
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 South Africa 2019 226005 95969 42.5
## 2 Kenya 2019 111049 20980 18.9
## 3 Nigeria 2019 126611 12214 9.65
## 4 Brazil 2019 111332 6462 5.80
## 5 Myanmar 2019 184951 9509 5.14
## 6 Viet Nam 2019 104124 2643 2.54
## 7 Indonesia 2019 629171 10370 1.65
## 8 India 2019 2883264 35419 1.23
## 9 China 2019 837860 6143 0.733
## 10 Philippines 2019 530626 1516 0.286
## 11 Pakistan 2019 373759 662 0.177
## 12 Bangladesh 2019 303925 117 0.0385
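As a quick check of the hivtb_rate column, the calculation can be reproduced by hand for the first three rows. The figures below are copied from the 2019 output above.

```r
# Values copied from the 2019 output above
rates <- data.frame(
  country = c("South Africa", "Kenya", "Nigeria"),
  total_cases = c(226005, 111049, 126611),
  tbhiv_total = c(95969, 20980, 12214)
)

# hivtb_rate = TBHIV outcome total as a percentage of notified TB cases
rates$hivtb_rate <- 100 * rates$tbhiv_total / rates$total_cases
print(rates)
```

For South Africa this is 95969 / 226005 × 100, which comes to about 42.5.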
Among countries with more than 100,000 tuberculosis cases in 2019, South Africa had the highest ratio of TBHIV to total tuberculosis cases at 42.5%, followed by Kenya at 18.9% and Nigeria at 9.65%
tbhiv_by_year <- tbhiv1 %>%
group_by(.,year) %>%
summarise(., tbhiv_total = sum(tbhiv_total))
kable(tbhiv_by_year)
year | tbhiv_total |
---|---|
2012 | 363207 |
2013 | 360953 |
2014 | 392464 |
2015 | 446016 |
2016 | 403686 |
2017 | 416406 |
2018 | 380606 |
2019 | 390368 |
# plotting total tuberculosis and HIV cases against year
tbhiv_by_year %>% ggplot(aes(x = year))+
geom_point(aes(y = tbhiv_total))+
geom_line(aes(y = tbhiv_total))
Note: The year with the highest TBHIV total is 2015. Totals have been lower in every year since, although the decline has not been steady
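The year-on-year movement can be made explicit with `diff()`. The sketch below copies the totals from the tbhiv_by_year table above into a stand-alone data frame (named `tbhiv_totals` here).

```r
# Totals copied from the tbhiv_by_year table above
tbhiv_totals <- data.frame(
  year = 2012:2019,
  tbhiv_total = c(363207, 360953, 392464, 446016,
                  403686, 416406, 380606, 390368)
)

# Change from the previous year; the first year has no predecessor, so NA
tbhiv_totals$change <- c(NA, diff(tbhiv_totals$tbhiv_total))
print(tbhiv_totals)

# Year with the largest total
peak_year <- tbhiv_totals$year[which.max(tbhiv_totals$tbhiv_total)]
print(peak_year)
```

The change column shows drops in 2016 and 2018 with partial rebounds in 2017 and 2019, so the series sits below its 2015 peak without falling every year.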
# calculating the mean estimated annual tuberculosis deaths per country and
# expressing them as a percentage of that country's total tuberculosis cases
tb_mortality <- select(tuberculosis_join, country, e_pop_num, e_mort_num) %>%
group_by(., country) %>%
drop_na() %>%
summarise(., total_population = max(e_pop_num, na.rm = TRUE)
, tb_death = mean(e_mort_num, na.rm = TRUE)) %>%
full_join(by_country, by = "country") %>%
mutate(., mortality_rate = (tb_death/total_cases)*100) %>%
drop_na() %>%
arrange(., desc(mortality_rate))
kable(tb_mortality)
country | total_population | tb_death | total_cases | mortality_rate |
---|---|---|---|---|
Anguilla | 15002 | 1.047619e+00 | 3 | 34.9206349 |
Mozambique | 31255435 | 2.376190e+04 | 120779 | 19.6738711 |
Serbia & Montenegro | 10101170 | 3.780000e+02 | 2710 | 13.9483395 |
Nigeria | 206139587 | 1.382857e+05 | 1373968 | 10.0646969 |
Central African Republic | 4829764 | 9.380952e+03 | 120664 | 7.7744417 |
Ghana | 31072945 | 1.647619e+04 | 231460 | 7.1183749 |
South Sudan | 11193729 | 6.730000e+03 | 95267 | 7.0643560 |
United Republic of Tanzania | 59734213 | 6.595238e+04 | 1014853 | 6.4987127 |
Cameroon | 26545864 | 1.809524e+04 | 279837 | 6.4663494 |
Malawi | 19129955 | 1.479048e+04 | 235457 | 6.2816039 |
Guinea-Bissau | 1967998 | 2.052381e+03 | 33490 | 6.1283397 |
Equatorial Guinea | 1402985 | 8.500000e+02 | 14133 | 6.0142928 |
Nepal | 29136808 | 2.261905e+04 | 426709 | 5.3008133 |
Lao People’s Democratic Republic | 7275556 | 4.261905e+03 | 81363 | 5.2381362 |
Lesotho | 2142252 | 6.990476e+03 | 135129 | 5.1731872 |
Kenya | 53771300 | 6.509524e+04 | 1476253 | 4.4094907 |
Papua New Guinea | 8947027 | 6.323810e+03 | 144143 | 4.3871777 |
Grenada | 112519 | 1.952381e+00 | 45 | 4.3386243 |
Somalia | 15893219 | 9.819048e+03 | 226946 | 4.3266009 |
Côte d’Ivoire | 26378275 | 1.375714e+04 | 320104 | 4.2977104 |
Angola | 32866268 | 2.052381e+04 | 484618 | 4.2350490 |
Ethiopia | 114963583 | 5.547619e+04 | 1403111 | 3.9537991 |
Gambia | 2416664 | 6.061905e+02 | 15357 | 3.9473235 |
Madagascar | 27691019 | 1.271429e+04 | 326929 | 3.8890052 |
Gabon | 2225728 | 2.414286e+03 | 62658 | 3.8531165 |
Congo | 5518092 | 4.180952e+03 | 110152 | 3.7956209 |
Eswatini | 1160164 | 3.306190e+03 | 88105 | 3.7525571 |
Niger | 24206636 | 5.028571e+03 | 136131 | 3.6939209 |
Eritrea | 3546427 | 9.442857e+02 | 26391 | 3.5780596 |
Comoros | 869595 | 5.638095e+01 | 1623 | 3.4738726 |
Burundi | 11890781 | 3.804762e+03 | 112033 | 3.3961082 |
Sudan | 44053386 | 9.619048e+03 | 283706 | 3.3904985 |
Liberia | 5057677 | 3.200000e+03 | 94522 | 3.3854552 |
Myanmar | 54409794 | 5.452381e+04 | 1611582 | 3.3832476 |
Mauritania | 4649660 | 1.154286e+03 | 34233 | 3.3718509 |
Saint Kitts and Nevis | 53192 | 1.142857e+00 | 34 | 3.3613445 |
Zambia | 18383956 | 1.814286e+04 | 540084 | 3.3592658 |
Burkina Faso | 20903278 | 2.447619e+03 | 73031 | 3.3514796 |
Namibia | 2540916 | 4.800000e+03 | 145488 | 3.2992412 |
Chad | 16425859 | 4.866667e+03 | 149455 | 3.2562756 |
United Arab Emirates | 9890400 | 5.071429e+01 | 1615 | 3.1402034 |
Barbados | 287371 | 2.523810e+00 | 83 | 3.0407344 |
Democratic Republic of the Congo | 89561404 | 5.414286e+04 | 1782554 | 3.0373754 |
Turkmenistan | 6031187 | 9.304762e+02 | 31503 | 2.9536114 |
Afghanistan | 38928341 | 1.252381e+04 | 424368 | 2.9511673 |
Dominica | 71991 | 2.523810e+00 | 87 | 2.9009305 |
Guinea | 13132792 | 5.214286e+03 | 185716 | 2.8076664 |
Timor-Leste | 1318442 | 9.210526e+02 | 33995 | 2.7093768 |
India | 1380004385 | 5.642857e+05 | 21337820 | 2.6445331 |
Uganda | 45741000 | 1.861905e+04 | 722953 | 2.5754161 |
Thailand | 69799978 | 2.042857e+04 | 793538 | 2.5743659 |
Sierra Leone | 7976985 | 4.657143e+03 | 182622 | 2.5501543 |
South Africa | 59308690 | 1.308571e+05 | 5179958 | 2.5262202 |
Botswana | 2351625 | 2.438095e+03 | 101160 | 2.4101376 |
Senegal | 16743930 | 2.995238e+03 | 125159 | 2.3931464 |
Ukraine | 48838058 | 1.009524e+04 | 432085 | 2.3364010 |
Bangladesh | 164689383 | 7.328571e+04 | 3137372 | 2.3358950 |
Curaçao | 164100 | 6.363636e-01 | 28 | 2.2727273 |
Mali | 20250834 | 2.076190e+03 | 97439 | 2.1307592 |
Djibouti | 988002 | 5.195238e+02 | 24853 | 2.0903867 |
Benin | 12123198 | 1.366667e+03 | 66849 | 2.0444085 |
Zimbabwe | 14862927 | 1.154762e+04 | 567028 | 2.0365165 |
Algeria | 43851043 | 2.957143e+03 | 149070 | 1.9837277 |
Sao Tome and Principe | 219161 | 4.614286e+01 | 2412 | 1.9130538 |
Togo | 8278737 | 8.190476e+02 | 43522 | 1.8819163 |
Antigua and Barbuda | 97928 | 1.285714e+00 | 69 | 1.8633540 |
Yemen | 29825968 | 2.357143e+03 | 127975 | 1.8418776 |
Indonesia | 273523621 | 1.114286e+05 | 6311768 | 1.7654098 |
Iceland | 341250 | 3.190476e+00 | 191 | 1.6704064 |
Libya | 6871287 | 4.357143e+02 | 26284 | 1.6577168 |
Viet Nam | 97338583 | 2.252381e+04 | 1369950 | 1.6441337 |
Saint Vincent and the Grenadines | 110947 | 2.714286e+00 | 171 | 1.5873016 |
Saudi Arabia | 34813867 | 9.585714e+02 | 62551 | 1.5324638 |
Russian Federation | 146404890 | 2.307143e+04 | 1526742 | 1.5111544 |
Belarus | 9871635 | 7.833333e+02 | 57333 | 1.3662870 |
Vanuatu | 307150 | 2.452381e+01 | 1813 | 1.3526646 |
Azerbaijan | 10139175 | 7.828571e+02 | 58866 | 1.3298970 |
Netherlands Antilles | 198662 | 7.000000e-01 | 53 | 1.3207547 |
Pakistan | 220892331 | 4.885714e+04 | 3717044 | 1.3144085 |
Saint Lucia | 183629 | 2.809524e+00 | 216 | 1.3007055 |
Wallis and Futuna Islands | 15098 | 4.285714e-01 | 33 | 1.2987013 |
Dominican Republic | 10847904 | 8.385714e+02 | 66243 | 1.2659019 |
Rwanda | 12952209 | 1.170952e+03 | 93814 | 1.2481638 |
Guyana | 786559 | 1.469048e+02 | 11908 | 1.2336644 |
Bolivia (Plurinational State of) | 11673029 | 1.833333e+03 | 151430 | 1.2106804 |
Puerto Rico | 3670308 | 1.614286e+01 | 1337 | 1.2073940 |
Chile | 19116209 | 5.533333e+02 | 45840 | 1.2070971 |
Estonia | 1399111 | 5.914286e+01 | 4949 | 1.1950466 |
Haiti | 11402533 | 3.052381e+03 | 265507 | 1.1496424 |
Ecuador | 17643060 | 9.900000e+02 | 88066 | 1.1241569 |
occupied Palestinian territory, including east Jerusalem | 5101416 | 5.142857e+00 | 459 | 1.1204482 |
Latvia | 2384150 | 1.587619e+02 | 14293 | 1.1107668 |
Finland | 5540718 | 5.857143e+01 | 5301 | 1.1049128 |
Japan | 128555196 | 3.914286e+03 | 360088 | 1.0870359 |
Bhutan | 771612 | 1.514286e+02 | 14613 | 1.0362593 |
Cambodia | 16718971 | 5.357143e+03 | 520488 | 1.0292539 |
France | 65273512 | 7.980952e+02 | 78081 | 1.0221376 |
Uzbekistan | 33469199 | 3.080952e+03 | 301911 | 1.0204836 |
Hungary | 10220509 | 1.811429e+02 | 18091 | 1.0012871 |
Greece | 11234993 | 8.380952e+01 | 8410 | 0.9965461 |
Tajikistan | 9537642 | 8.738095e+02 | 88628 | 0.9859294 |
Lithuania | 3501842 | 2.742857e+02 | 28097 | 0.9762100 |
Turks and Caicos Islands | 38718 | 4.761905e-01 | 49 | 0.9718173 |
Croatia | 4428075 | 9.980952e+01 | 10464 | 0.9538372 |
Republic of Moldova | 4202659 | 5.776190e+02 | 60595 | 0.9532454 |
Jamaica | 2961161 | 1.890476e+01 | 2002 | 0.9442938 |
Suriname | 586634 | 2.066667e+01 | 2217 | 0.9321906 |
Guatemala | 17915567 | 5.090476e+02 | 54672 | 0.9310938 |
Panama | 4314768 | 2.671429e+02 | 29156 | 0.9162535 |
Cabo Verde | 555988 | 3.528571e+01 | 4059 | 0.8693204 |
Portugal | 10604066 | 3.876190e+02 | 44802 | 0.8651825 |
Italy | 60673694 | 4.347619e+02 | 50359 | 0.8633251 |
Aruba | 106766 | 9.047619e-01 | 105 | 0.8616780 |
North Macedonia | 2083458 | 5.319048e+01 | 6343 | 0.8385697 |
Honduras | 9904608 | 4.880952e+02 | 58232 | 0.8381908 |
Philippines | 109581085 | 2.942857e+04 | 3567369 | 0.8249377 |
Bahamas | 393248 | 7.380952e+00 | 895 | 0.8246874 |
Mauritius | 1271767 | 1.895238e+01 | 2322 | 0.8162093 |
Bosnia and Herzegovina | 3765422 | 1.635238e+02 | 20175 | 0.8105269 |
Mexico | 128932753 | 3.128571e+03 | 389743 | 0.8027268 |
Kazakhstan | 18776707 | 2.295714e+03 | 288492 | 0.7957636 |
Trinidad and Tobago | 1399491 | 3.342857e+01 | 4250 | 0.7865546 |
Peru | 32971846 | 3.166667e+03 | 405028 | 0.7818390 |
Greenland | 56968 | 7.000000e+00 | 907 | 0.7717751 |
Belize | 397621 | 1.390476e+01 | 1818 | 0.7648384 |
Iraq | 40222503 | 1.104762e+03 | 145762 | 0.7579218 |
Costa Rica | 5094114 | 6.419048e+01 | 8867 | 0.7239255 |
Norway | 5421242 | 3.233333e+01 | 4490 | 0.7201188 |
Marshall Islands | 59194 | 2.204762e+01 | 3087 | 0.7142086 |
Micronesia (Federated States of) | 115021 | 1.723810e+01 | 2416 | 0.7134973 |
Venezuela (Bolivarian Republic of) | 30081827 | 9.666667e+02 | 139174 | 0.6945742 |
Solomon Islands | 686878 | 4.928571e+01 | 7115 | 0.6927015 |
Poland | 38556699 | 7.961905e+02 | 115385 | 0.6900294 |
Kyrgyzstan | 6524191 | 7.090476e+02 | 102769 | 0.6899431 |
Nicaragua | 6624554 | 2.480952e+02 | 36682 | 0.6763405 |
Armenia | 3069597 | 1.391429e+02 | 20906 | 0.6655642 |
Colombia | 50882884 | 1.519048e+03 | 230162 | 0.6599906 |
Serbia | 9193818 | 1.542500e+02 | 23736 | 0.6498568 |
New Caledonia | 285491 | 4.476191e+00 | 691 | 0.6477844 |
Sri Lanka | 21413250 | 1.010476e+03 | 159305 | 0.6343029 |
Slovenia | 2078932 | 1.823810e+01 | 2939 | 0.6205544 |
Austria | 9006400 | 5.785714e+01 | 9390 | 0.6161570 |
Uruguay | 3473727 | 8.914286e+01 | 14634 | 0.6091489 |
Bulgaria | 7997951 | 2.087619e+02 | 34331 | 0.6080857 |
Paraguay | 7132530 | 2.766667e+02 | 45569 | 0.6071379 |
Sint Maarten (Dutch part) | 42882 | 1.818182e-01 | 30 | 0.6060606 |
Fiji | 896444 | 3.128571e+01 | 5229 | 0.5983116 |
French Polynesia | 280904 | 5.333333e+00 | 900 | 0.5925926 |
Slovakia | 5459643 | 4.109524e+01 | 6944 | 0.5918093 |
China, Macao SAR | 649342 | 3.661905e+01 | 6336 | 0.5779521 |
Spain | 47084242 | 5.247619e+02 | 91350 | 0.5744520 |
Mongolia | 3278292 | 3.828571e+02 | 66759 | 0.5734914 |
Germany | 83783945 | 4.361905e+02 | 76671 | 0.5689119 |
Kiribati | 119446 | 4.185714e+01 | 7376 | 0.5674775 |
Seychelles | 98340 | 1.428571e+00 | 257 | 0.5558644 |
Czechia | 10708982 | 5.971429e+01 | 10832 | 0.5512766 |
Sweden | 10099270 | 5.119048e+01 | 9327 | 0.5488418 |
Ireland | 4937796 | 3.295238e+01 | 6056 | 0.5441278 |
Romania | 22137423 | 1.607619e+03 | 296657 | 0.5419117 |
Andorra | 84461 | 5.238095e-01 | 97 | 0.5400098 |
Nauru | 10834 | 6.190476e-01 | 116 | 0.5336617 |
Northern Mariana Islands | 58412 | 4.095238e+00 | 777 | 0.5270577 |
Morocco | 36910558 | 2.785714e+03 | 530361 | 0.5252487 |
Tonga | 105697 | 1.333333e+00 | 258 | 0.5167959 |
Brazil | 212559409 | 7.542857e+03 | 1488878 | 0.5066135 |
Lebanon | 6859408 | 5.414286e+01 | 10731 | 0.5045462 |
Republic of Korea | 51269183 | 3.023810e+03 | 600809 | 0.5032897 |
Tuvalu | 11792 | 1.952381e+00 | 391 | 0.4993302 |
United States of America | 331002647 | 9.514286e+02 | 192395 | 0.4945183 |
Palau | 19861 | 1.095238e+00 | 224 | 0.4889456 |
Malaysia | 32365998 | 1.847619e+03 | 382075 | 0.4835750 |
China | 1439323774 | 6.280952e+04 | 13048274 | 0.4813627 |
Canada | 37742157 | 1.331429e+02 | 27800 | 0.4789311 |
Denmark | 5792203 | 2.647619e+01 | 5563 | 0.4759337 |
Belgium | 11589616 | 7.457143e+01 | 16070 | 0.4640412 |
Argentina | 45195777 | 8.595238e+02 | 185463 | 0.4634476 |
Luxembourg | 625976 | 2.285714e+00 | 494 | 0.4626952 |
Egypt | 102334403 | 7.119048e+02 | 158614 | 0.4488285 |
Cyprus | 1207361 | 3.380952e+00 | 755 | 0.4478083 |
Israel | 8655541 | 3.533333e+01 | 7941 | 0.4449482 |
Iran (Islamic Republic of) | 83992953 | 7.623810e+02 | 177944 | 0.4284387 |
Maldives | 540542 | 9.904762e+00 | 2326 | 0.4258281 |
Brunei Darussalam | 437483 | 1.685714e+01 | 3963 | 0.4253632 |
United Kingdom of Great Britain and Northern Ireland | 67886004 | 4.709524e+02 | 113020 | 0.4166983 |
Guam | 168783 | 6.428571e+00 | 1546 | 0.4158196 |
Netherlands | 17134873 | 6.390476e+01 | 15594 | 0.4098035 |
Tunisia | 11818618 | 1.657143e+02 | 41880 | 0.3956884 |
Switzerland | 8654618 | 3.485714e+01 | 8880 | 0.3925354 |
Cuba | 11339255 | 5.209524e+01 | 13321 | 0.3910760 |
Georgia | 4362184 | 2.342857e+02 | 60364 | 0.3881216 |
New Zealand | 4822233 | 1.809524e+01 | 5207 | 0.3475175 |
Türkiye | 84339067 | 7.809524e+02 | 228680 | 0.3415045 |
Jordan | 10203140 | 2.204762e+01 | 6639 | 0.3320925 |
Cayman Islands | 65720 | 1.428571e-01 | 44 | 0.3246753 |
El Salvador | 6486201 | 1.328571e+02 | 41875 | 0.3172708 |
Oman | 5106622 | 1.938095e+01 | 6177 | 0.3137600 |
Australia | 25499881 | 6.438095e+01 | 21080 | 0.3054125 |
Malta | 441539 | 2.666667e+00 | 900 | 0.2962963 |
Bahrain | 1701583 | 9.142857e+00 | 3107 | 0.2942664 |
China, Hong Kong SAR | 7496988 | 2.285714e+02 | 83901 | 0.2724299 |
Singapore | 5850343 | 8.533333e+01 | 31895 | 0.2675445 |
Bermuda | 66260 | 4.761900e-02 | 20 | 0.2380952 |
Albania | 3129701 | 1.404762e+01 | 7744 | 0.1814000 |
Kuwait | 4270563 | 2.247619e+01 | 12708 | 0.1768665 |
Montenegro | 628062 | 2.250000e+00 | 1574 | 0.1429479 |
Samoa | 198410 | 2.047619e+00 | 1687 | 0.1213764 |
American Samoa | 59684 | 4.761900e-02 | 45 | 0.1058201 |
Syrian Arab Republic | 21362541 | 4.419048e+01 | 63208 | 0.0699128 |
Qatar | 2881060 | 5.809524e+00 | 8619 | 0.0674037 |
British Virgin Islands | 30237 | 0.000000e+00 | 6 | 0.0000000 |
Cook Islands | 19094 | 0.000000e+00 | 19 | 0.0000000 |
Monaco | 39244 | 0.000000e+00 | 5 | 0.0000000 |
Montserrat | 4999 | 0.000000e+00 | 4 | 0.0000000 |
Niue | 1902 | 0.000000e+00 | 5 | 0.0000000 |
San Marino | 33938 | 0.000000e+00 | 1 | 0.0000000 |
Tokelau | 1550 | 0.000000e+00 | 2 | 0.0000000 |
# Countries with the highest TB mortality rate among those with
# more than 100,000 tuberculosis cases
tb_mortality %>%
  filter(total_cases > 100000) %>%
  head(10)
## # A tibble: 10 × 5
## country total_population tb_death total_cases mortality…¹
## <chr> <dbl> <dbl> <dbl> <dbl>
## 1 Mozambique 31255435 23762. 120779 19.7
## 2 Nigeria 206139587 138286. 1373968 10.1
## 3 Central African Republic 4829764 9381. 120664 7.77
## 4 Ghana 31072945 16476. 231460 7.12
## 5 United Republic of Tanzania 59734213 65952. 1014853 6.50
## 6 Cameroon 26545864 18095. 279837 6.47
## 7 Malawi 19129955 14790. 235457 6.28
## 8 Nepal 29136808 22619. 426709 5.30
## 9 Lesotho 2142252 6990. 135129 5.17
## 10 Kenya 53771300 65095. 1476253 4.41
## # … with abbreviated variable name ¹mortality_rate
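Note: Among countries with more than 100,000 cases, nine of the ten with the highest tuberculosis mortality rate are African; the exception is Nepal. This pattern may reflect limited access to quality health care services in many African countries, although the data alone do not establish the cause.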
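The filter above appears to rely on tb_mortality already being sorted by mortality_rate, and it prints a table rather than a chart. A minimal sketch of making the sort explicit and plotting the result is shown below. It assumes the column names seen in the output; tb_sample, top_mortality and mortality_plot are hypothetical names, and the rows are rounded copies of a few lines from the output above, not the full data set.

```r
library(dplyr)
library(ggplot2)

# A few rows copied (rounded) from the output above, for illustration only.
# The real tb_mortality table has one row per country.
tb_sample <- tibble::tribble(
  ~country,     ~tb_death, ~total_cases, ~mortality_rate,
  "Kenya",          65095,      1476253,            4.41,
  "Nepal",          22619,       426709,            5.30,
  "Mozambique",     23762,       120779,           19.7,
  "Ghana",          16476,       231460,            7.12,
  "Nigeria",       138286,      1373968,           10.1,
  "Gambia",           606,        15357,            3.95  # below the case threshold
)

top_mortality <- tb_sample %>%
  filter(total_cases > 100000) %>%     # keep high-burden countries only
  arrange(desc(mortality_rate)) %>%    # sort explicitly instead of relying on row order
  slice_head(n = 10)

# Horizontal bar chart, highest rate at the top
mortality_plot <- ggplot(top_mortality,
                         aes(x = reorder(country, mortality_rate),
                             y = mortality_rate)) +
  geom_col() +
  coord_flip() +
  labs(x = NULL,
       y = "TB deaths per 100 cases",
       title = "Highest TB mortality rates (countries with > 100,000 cases)")

mortality_plot
```

With the full tb_mortality table in place of tb_sample, the same pipeline would return the ten rows printed above, provided the table's existing order is indeed by descending mortality_rate.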
# Worldwide tuberculosis death rate
tb_mortality %>%
  summarise(total_cases = sum(total_cases, na.rm = TRUE),
            total_death = sum(tb_death, na.rm = TRUE)) %>%
  mutate(rate = (total_death / total_cases) * 100)
## # A tibble: 1 × 3
## total_cases total_death rate
## <dbl> <dbl> <dbl>
## 1 87203222 1925091. 2.21
Note: The global tuberculosis mortality rate is about 2.21%, that is, roughly 2.21 deaths per 100 recorded cases (1,925,091 / 87,203,222 × 100).
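The global figure pools deaths and cases across all countries before dividing, which is not the same as averaging the country-level rates: pooling weights each country by its case count. The sketch below illustrates the difference on a hypothetical two-country table (the names toy, toy_rates, pooled and mean_of_rates and all numbers are invented for illustration), using the same column names as tb_mortality.

```r
library(dplyr)

# Hypothetical data: a small country with a high rate and a large one with a low rate
toy <- tibble::tibble(
  country     = c("A", "B"),
  tb_death    = c(10, 100),
  total_cases = c(100, 10000)
)

# Country-level rates, computed the same way as mortality_rate in the table above
toy_rates <- toy %>%
  mutate(mortality_rate = tb_death / total_cases * 100)

# Pooled rate: sum deaths and cases first, then divide (the approach used above)
pooled <- toy %>%
  summarise(total_cases = sum(total_cases, na.rm = TRUE),
            total_death = sum(tb_death, na.rm = TRUE)) %>%
  mutate(rate = total_death / total_cases * 100)

# Unweighted mean of the country rates, for comparison
mean_of_rates <- mean(toy_rates$mortality_rate)

pooled$rate     # 110 / 10100 * 100, about 1.09
mean_of_rates   # (10 + 1) / 2 = 5.5
```

The pooled rate sits close to the large country's 1% because that country contributes almost all of the cases, which is why the global 2.21% is lower than many of the country-level rates listed above.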
After thoroughly analyzing the WHO data set and answering the questions through visualizations, the insights below were drawn from the exploratory analysis.

#### The first question “What has been the trend of tuberculosis cases from 2000 to 2020?”
This reveals the tuberculosis trend across the years. From 2000 to 2020 the trend was generally upward, but a drastic drop was recorded in 2013. 2019 had the highest number of tuberculosis cases recorded in the period.

#### The second question “What is the trend of tuberculosis cases in countries across the globe?”
This gave insight into which countries are most heavily burdened with tuberculosis; the least burdened countries were checked as well. The top ten and bottom ten countries by tuberculosis cases were analyzed and plotted for easy visualization. Nine of the ten countries with the most tuberculosis cases were Asian countries, which is likely due to the region's large population.

#### The third question “What is the pattern of tuberculosis cases across age groups and sex”
This reveals the predominant age groups. Tuberculosis is caused by the bacterium Mycobacterium tuberculosis, and a weakened immune system makes the disease more likely to develop. Middle-aged individuals (20-64) are the group most likely to engage in activities and habits such as smoking, alcohol intake and substance abuse that weaken the immune system. Looking at total cases among males and females, males are more likely than their female counterparts to engage in health-hazardous activities.

#### The fourth question “Tuberculosis cases with HIV across the globe”