# Description

R package to analyze the parliamentary records of the 19th legislative period of the Bundestag, the German parliament.

# Installation

The package can be installed with the `remotes` package:

```r
remotes::install_url("https://git.flavigny.de/christian/hateimparlament/archive/master.zip")
```

To build the vignettes as well, pass `build_vignettes = TRUE`. This takes a long time, because the required records have to be fetched and parsed first, and it occasionally fails when bundestag.de times out.

## Install with vignettes

Alternatively, clone the repository and build the vignettes manually, e.g. on Linux:

```sh
git clone https://git.flavigny.de/christian/hateimparlament
cd hateimparlament
```

Then open an R session and run

```r
devtools::load_all()  # load package
devtools::wd()        # set working directory
```

Then fetch all records, read them, and write the parsed tibbles to CSV files:

```r
fetch_all(create = TRUE)
res <- read_all() %>% repair()
write_to_csv(res, create = TRUE)
```

Now the vignettes build faster with

```r
devtools::install(build_vignettes = TRUE)
```

This creates a `doc/` directory containing all built vignettes.

# Features

The package provides four main features:

## Download records

Records have to be downloaded before they can be analyzed. This is done with `fetch_all`:

```r
fetch_all("records/", create = TRUE) # path to directory where records should be stored
```

This downloads all parliamentary records and stores them as `.xml` files in the given directory.

## Parse records

To use the records in R, convert them to `tibble`s with

```r
res_raw <- read_all("records/") # path to directory where records are stored
```

`res_raw` is a named list of five `tibble`s:

### Speaker

Table of all speakers of this legislative period.

Fields:

- `id`: Unique speaker id
- `prename`: First name
- `lastname`: Surname
- `fraction`: Name of the parliamentary group (Fraktion), if the speaker is a member of parliament
- `title`: Title, e.g.
  "Prof"
- `role_short`: Short name of the role, e.g. "Bundeskanzlerin"
- `role_long`: Long name of the role

### Speeches

Table of all speeches given during this legislative period.

Fields:

- `id`: Unique speech id
- `speaker`: Principal speaker (the person standing at the lectern during the speech)
- `date`: Date of the session

### Talks

A speech can contain multiple talks by different people. The largest part is usually the main speech by the principal speaker, but there are often questions from other members of parliament or calls to order by the president of the Bundestag.

Fields:

- `speech_id`: Speech in which this talk was given
- `speaker`: Person who actually talks
- `content`: Spoken content

### Comments

These are the interjections made during the speeches.

Fields:

- `speech_id`: The speech that was interrupted
- `on_speaker`: The speaker who was interrupted
- `fraction`: The parliamentary group of the commenter
- `commenter`: The person who interrupted the speech
- `comment`: The content of the comment

### Applause

Table of all rounds of applause during this legislative period.

Fields:

- `speech_id`: Speech during which the applause occurred
- `on_speaker`: Speaker who was applauded

In addition, there is one logical field for every parliamentary group in the Bundestag (`CDU_CSU`, `SPD`, `FDP`, `DIE_LINKE`, `BUENDNIS_90_DIE_GRUENEN`, `AfD`), indicating whether that group applauded.

## Repair records

The parliamentary records usually contain a number of major and minor formatting issues. Most of them are resolved with

```r
res <- repair(res_raw)
```

With `lookup_speaker = TRUE`, the commenters in `res_raw$comments` are additionally matched with their respective speaker ids.

## Analysis

The package also provides some functions to analyze the parliamentary records and draw plots:

- `bar_plot_fractions`
- `find_word`
- `join_speaker`
- `word_usage_by_date`

See their documentation with the `?` operator.
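
To make the table layout described above more concrete, the following sketch builds two tiny mock tables with the documented field names and counts talks per parliamentary group. It uses only base R and invented values. It does not call the package, and the `merge` step is a generic join, not the package's own `join_speaker`.

```r
# Mock tables using the documented field names; all values are invented.
speaker <- data.frame(
  id = c("s1", "s2", "s3"),
  lastname = c("Muster", "Beispiel", "Probe"),
  fraction = c("SPD", "FDP", "SPD"),
  stringsAsFactors = FALSE
)

talks <- data.frame(
  speech_id = c("r1", "r1", "r2", "r3"),
  speaker = c("s1", "s2", "s3", "s1"),
  content = c("text a", "text b", "text c", "text d"),
  stringsAsFactors = FALSE
)

# Attach the speaker details to every talk via the speaker id.
talks_with_speaker <- merge(talks, speaker, by.x = "speaker", by.y = "id")

# Count the number of talks per parliamentary group.
talks_per_fraction <- table(talks_with_speaker$fraction)
print(talks_per_fraction)
```

With real data, the same join would presumably be applied to `res$talks` and `res$speaker`, assuming the list elements are named after the tables.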

The vignettes contain various analyses of the records, for example:

- "Who talks the most?"
- "Which party gives the most speeches?"
- "Which party comments the most on which parties?"
- "When are which topics discussed the most?"
- ...

# Contributing

Development is easiest with `devtools`:

```r
library(devtools)
```

After changing or adding functionality, reload all package functions with

```r
load_all()
```

To avoid reading all records every time you start a new R session, write the parsed tibbles to CSV files:

```r
tables <- read_all()
tables <- repair(tables)
write_to_csv(tables, "path/to/csv/")
```

Later you can use

```r
res <- read_from_csv("path/to/csv/")
```

to load the stored tibbles quickly.

Never use `source(...)` or similar, and never use `library(...)`, in package code.

To add a new package as a dependency, use:

```r
use_package("my-good-old-package")
```

To make the imported functions available, add the package to `R/hateimparlament-package.R` with an `@import` tag.

To create or update the documentation (this calls roxygen):

```r
document()
```

To build a vignette:

```r
rmarkdown::render("vignettes/test.Rmd")
```
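
For the `@import` step mentioned above, a roxygen block in `R/hateimparlament-package.R` could look roughly like the sketch below. The package name `dplyr` is only a placeholder for illustration; the actual contents of that file are not shown here.

```r
# Hypothetical excerpt of R/hateimparlament-package.R.
# "dplyr" is a placeholder dependency, not necessarily one the package uses.

#' @import dplyr
NULL
```

After editing the file, running `document()` regenerates the `NAMESPACE` file so that the import takes effect.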