An R package to analyze the parliamentary records of the 19th legislative period of the Bundestag, the German parliament.

---
title: "funwithdata"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{funwithdata}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)
```

```{r setup}
library(hateimparlament)
library(dplyr)
library(ggplot2)
```
## Preparation of data

First, you need to download all records of the current legislative period.

```r
fetch_all("../records/") # path to directory where records should be stored
```

Second, those `.xml` files need to be parsed into `R` `tibbles`. This is accomplished by:

```r
res <- read_all("../records/") %>% repair()
reden <- res$reden
redner <- res$redner
talks <- res$talks
```

Here, `repair` fixes a number of formatting issues in the records, and the result is
unpacked into more descriptive variables.

For development purposes, we load the tables from CSV files instead.

```{r}
tables <- read_from_csv("../csv/")
comments <- tables$comments
reden <- tables$reden
redner <- tables$redner
talks <- tables$talks
```
## Analysis

Now we can start analysing the parsed dataset, e.g. to find out which parliamentary group (`fraktion`) gives the most talks:

```{r}
left_join(reden, redner, by = c("redner" = "id")) %>%
  group_by(fraktion) %>%
  summarize(n = n()) %>%
  ggplot(aes(x = fraktion, y = n)) +
  geom_bar(stat = "identity")
```
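
To make the join-and-count step easier to follow without the full dataset, here is a minimal sketch on toy data. The tibbles `reden_toy` and `redner_toy` and their rows are made up for illustration; only the column names `redner`, `id` and `fraktion` are taken from the chunk above, and the real tables may contain more columns.

```r
library(dplyr)

# Toy stand-ins for the `redner` and `reden` tables (made-up rows, not real data).
redner_toy <- tibble(
  id = c("a", "b", "c"),
  fraktion = c("X", "X", "Y")
)
reden_toy <- tibble(
  id = c("r1", "r2", "r3", "r4"),
  redner = c("a", "b", "b", "c")
)

# Attach each speaker's group to their speeches, then count speeches per group.
# count(..., sort = TRUE) is shorthand for
# group_by() + summarize(n = n()) + arrange(desc(n)).
talks_per_fraktion <- reden_toy %>%
  left_join(redner_toy, by = c("redner" = "id")) %>%
  count(fraktion, sort = TRUE)
```

The resulting tibble has one row per `fraktion` and a column `n`, which is the shape that the `ggplot()` call above expects.
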
### Count occurrences of a word

```{r}
find_word(tables, "hitler") %>%
  filter(occurences > 0) %>%
  join_redner(tables) %>%
  select(content, fraktion) %>%
  group_by(fraktion) %>%
  summarize(n = n()) %>%
  arrange(desc(n))
```
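
For readers who want to see what such a word count might look like in plain `dplyr`, the following is a rough sketch on toy data. It is not the package's implementation of `find_word`: the helper `count_word` and the tibble `talks_toy` are hypothetical, and matching here is assumed to be literal and case-insensitive.

```r
library(dplyr)

# Hypothetical helper, not part of the package: counts literal,
# case-insensitive matches of `word` in each element of `text`.
count_word <- function(text, word) {
  lowered <- tolower(text)
  matches <- gregexpr(tolower(word), lowered, fixed = TRUE)
  lengths(regmatches(lowered, matches))
}

# Toy stand-in for a table of talks (made-up rows, not real data).
talks_toy <- tibble(
  fraktion = c("X", "Y", "X"),
  content = c("Data and more data.", "Nothing here.", "Data again.")
)

# Keep the talks that mention the word, then count them per group.
word_hits <- talks_toy %>%
  mutate(occurences = count_word(content, "data")) %>%
  filter(occurences > 0) %>%
  count(fraktion, sort = TRUE)
```

As in the chunk above, this counts the number of talks that mention the word at least once per `fraktion`, not the total number of mentions.
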