| @@ -1,106 +1,97 @@ | |||||
| # How to develop | |||||
| # Description | |||||
| ```r | |||||
| # everything works with devtools (loads some other packages too) | |||||
| library(devtools) | |||||
| # reload all package functions | |||||
| load_all() | |||||
| R package to analyze parliamentary records of the 19th legislative period of the Bundestag, | |||||
| the German parliament. | |||||
| #write to CSV files to speed up loading | |||||
| tables <- read_all() | |||||
| tables <- repair(tables) | |||||
| write_to_csv(tables) | |||||
| ``` | |||||
| We NEVER use source(...), etc.! Also NEVER use library(...). | |||||
| But to add new packages (as dependency), use: | |||||
| ```r | |||||
| use_package("my-good-old-package") | |||||
| ``` | |||||
| To make package imports available, you have to add them to `R/hateimparlament-package.R` | |||||
| as `@import <package>`. | |||||
| To reload / create documentation (calls roxygen) | |||||
| ```r | |||||
| document() | |||||
| ``` | |||||
| # Features | |||||
| Build vignettes | |||||
| ```r | |||||
| rmarkdown::render("vignettes/bla.Rmd") | |||||
| ``` | |||||
| The package mainly supplies 4 functionalities: | |||||
| # Download | |||||
| ## Download records | |||||
| Before parsing, fetch.R must be run to download all protocols. | |||||
| To analyze records, they need to be downloaded. This is done with `fetch_all`: | |||||
| ```r | ```r | ||||
| fetch_all("../inst/records/") # path to directory where records should be stored | |||||
| fetch_all("records/", create = TRUE) # path to directory where records should be stored | |||||
| ``` | ``` | ||||
| This downloads all parliamentary records and stores them as `.xml` files in the given directory. | |||||
| # Parsing | |||||
| ## tables | |||||
| ## Parse records | |||||
| parse.R parses all downloaded logs and creates 5 tibbles. | |||||
| repair.R then cleans up the errors in these tibbles. | |||||
| To use the records in R, they are converted to `tibble`s with | |||||
| ```r | ```r | ||||
| read_all("../inst/records/") %>% repair() | |||||
| res_raw <- read_all("records/") # path to directory where records are stored | |||||
| ``` | ``` | ||||
| `res_raw` is a named list with 5 `tibble`s: | |||||
| ### Speaker | ### Speaker | ||||
| structure: `id` , `first_name` , `last_name` , `fraction` , `title` , `role_short`, `role_long`. | |||||
| Table of all speakers of this legislative period. | |||||
| Obtained from the `<speaker list>` entry at the end of the transcripts. | |||||
| Fields: | |||||
| - `id`: Unique speaker id | |||||
| - `prename`: Prename | |||||
| - `lastname`: Surname | |||||
| - `fraction`: Name of fraction if the speaker is member of parliament. | |||||
| - `title`: Title, e.g. "Prof" | |||||
| - `role_short`: Short name of role, e.g. "Bundeskanzlerin" | |||||
| - `role_long`: Long name of role | |||||
| ### Speeches | ### Speeches | ||||
| Structure: `id` , `speaker` | |||||
| The speech `id` is specified in the protocol and is unique. A speech is a `<speech>` entry in the session history. A speech always has a main speaker (the one standing at the lectern). | |||||
| Within a speech, there can be different speech entries: | |||||
| - Comments: Applause, interjections, etc. | |||||
| - Speeches: Typically mainly the main speaker, but also interjections. | |||||
| These are stored in the talks, comments and applause tables when parsing. | |||||
| Table of all speeches given during this legislative period. | |||||
| Fields: | |||||
| - `id`: Unique speech id | |||||
| - `speaker`: Principal speaker (the person standing behind the lectern during the speech). | |||||
| - `date`: Date of session | |||||
| ### Talks | ### Talks | ||||
| Structure: `speech_id` , `speaker` , `content`. | |||||
| Within a speech, there can be multiple talks by different people. Mostly this is the main speech | |||||
| by the principal speaker, but usually there are questions by other members of parliament or | |||||
| order calls by the president of the Bundestag. | |||||
| These are the actual talk entries that appear within _speeches_. | |||||
| Fields: | |||||
| - `speech_id`: Speech in which this talk has been given | |||||
| - `speaker`: Person that actually talks | |||||
| - `content`: Spoken content | |||||
| - `speech_id`: the speech in which the contribution appears. | |||||
| - `speaker`: The speaker of the speech entry. | |||||
| - `content`: The content of the speech. | |||||
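As a rough illustration of how the talks and speaker tables relate, the following sketch joins two toy tables in base R. The rows are invented, and it assumes that `talks$speaker` holds the same ids as `speaker$id`, which the field descriptions suggest but do not state outright.

```r
# Toy stand-ins for the speaker and talks tables (all values are invented).
speaker <- data.frame(
  id = c("s1", "s2"),
  lastname = c("Muster", "Beispiel"),
  fraction = c("SPD", "FDP"),
  stringsAsFactors = FALSE
)
talks <- data.frame(
  speech_id = c("ID1", "ID1", "ID2"),
  speaker = c("s1", "s2", "s2"),
  content = c("Main speech ...", "A question ...", "Another speech ..."),
  stringsAsFactors = FALSE
)

# Assumption: talks$speaker refers to speaker$id.
# Attach the speaker details to every talk.
talks_named <- merge(talks, speaker, by.x = "speaker", by.y = "id")

# Count the talks per fraction.
talks_per_fraction <- table(talks_named$fraction)
print(talks_per_fraction)
```

With real data, the same join should work on the tibbles returned by `read_all`, provided the id assumption holds.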
| ###comments | |||||
| ### Comments | |||||
| These are the interjections that appear during the speeches. | These are the interjections that appear during the speeches. | ||||
| They have the following structure: | |||||
| - `speech_id`: The speech that was interrupted. | |||||
| - `on_speaker`: The speaker who was interrupted. | |||||
| - `fraction` | |||||
| - `commenter`: The person who interrupted the speech. | |||||
| - `comment`: The content of the comment. | |||||
| Fields: | |||||
| - `speech_id`: The speech that was interrupted | |||||
| - `on_speaker`: The speaker who was interrupted | |||||
| - `fraction`: The fraction of the commenter | |||||
| - `commenter`: The person who interrupted the speech | |||||
| - `comment`: The content of the comment | |||||
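A table with this structure lends itself to simple cross-tabulations. The sketch below uses an invented toy table with the fields listed above and base R only; it is meant as an illustration, not as one of the package's own analysis functions.

```r
# Toy stand-in for the comments table (all values are invented).
comments <- data.frame(
  speech_id = c("ID1", "ID1", "ID2", "ID2"),
  on_speaker = c("s1", "s1", "s2", "s2"),
  fraction = c("FDP", "AfD", "SPD", "FDP"),
  commenter = c("s2", NA, "s1", "s2"),
  comment = c("Hear, hear!", "Nonsense!", "Exactly!", "Not true!"),
  stringsAsFactors = FALSE
)

# Which fraction interrupted which speaker, and how often?
interruptions <- table(
  fraction = comments$fraction,
  on_speaker = comments$on_speaker
)
print(interruptions)

# Fractions ordered by their total number of comments.
comments_per_fraction <- sort(table(comments$fraction), decreasing = TRUE)
print(comments_per_fraction)
```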
| ### Applause | |||||
| Table containing all the rounds of applause that happened during this legislative period. | |||||
| ###applause | |||||
| Fields: | |||||
| - `speech_id`: Speech during which the applause occurred | |||||
| - `on_speaker`: Speaker who was applauded | |||||
| The logical columns show which party applauded which speaker during which speech, and which did not. | |||||
| And then logical fields `CDU_CSU`, `SPD`, `FDP`, `DIE_LINKE`, `BUENDNIS_90_DIE_GRUENEN`, `AfD` | |||||
| for every fraction in the Bundestag, signifying whether this fraction applauded. | |||||
| structure: `speech_id`, `on_speaker`, `CDU_CSU`, `SPD`, `FDP`, `DIE_LINKE`, `BUENDNIS_90_DIE_GRUENEN`, `AfD` | |||||
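Because every fraction has its own logical column, applause counts reduce to column sums. The following sketch shows this on an invented toy table using base R; the column names follow the structure described above, everything else is hypothetical.

```r
# Toy stand-in for the applause table (all values are invented).
applause <- data.frame(
  speech_id = c("ID1", "ID1", "ID2"),
  on_speaker = c("s1", "s1", "s2"),
  CDU_CSU = c(TRUE, FALSE, TRUE),
  SPD = c(TRUE, TRUE, FALSE),
  FDP = c(FALSE, FALSE, TRUE),
  DIE_LINKE = c(FALSE, TRUE, FALSE),
  BUENDNIS_90_DIE_GRUENEN = c(FALSE, TRUE, FALSE),
  AfD = c(FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)
fractions <- c("CDU_CSU", "SPD", "FDP", "DIE_LINKE",
               "BUENDNIS_90_DIE_GRUENEN", "AfD")

# How often did each fraction applaud overall? (TRUE counts as 1.)
applause_counts <- sapply(applause[fractions], sum)
print(applause_counts)

# How often did each fraction applaud each speaker?
applause_by_speaker <- aggregate(
  applause[fractions],
  by = list(on_speaker = applause$on_speaker),
  FUN = sum
)
print(applause_by_speaker)
```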
| ## Repair records | |||||
| The parliamentary records usually contain some major and minor formatting issues. These are | |||||
| mostly resolved by using | |||||
| ```r | |||||
| res <- repair(res_raw) | |||||
| ``` | |||||
| By passing `lookup_speaker = TRUE`, commenters in | |||||
| `res_raw$comments` are also matched with their respective speaker ids. | |||||
| # Analysis | |||||
| ## Analysis | |||||
| analysis.R provides some functions to analyze the "Plenarprotokolle" and to create plots. | |||||
| `analyze.R` provides some functions to analyze the parliamentary records and draw some plots. | |||||
| In the vignettes you can find different analyses of the protocols, for example: | In the vignettes you can find different analyses of the protocols, for example: | ||||
| @@ -110,4 +101,44 @@ In the vignettes you can find different analyses of the protocols, for example: | |||||
| - "When are which topics discussed the most?" | - "When are which topics discussed the most?" | ||||
| - ... | - ... | ||||
| # Contributing | |||||
| Development is easiest with `devtools`: | |||||
| ```r | |||||
| library(devtools) | |||||
| ``` | |||||
| After you have changed something or added functionality, you can reload all package functions with | |||||
| ```r | |||||
| load_all() | |||||
| ``` | |||||
| If you want to avoid reading all records every time you start a new R session, you can | |||||
| write your parsed tibbles to CSV files: | |||||
| ```r | |||||
| tables <- read_all() | |||||
| tables <- repair(tables) | |||||
| write_to_csv(tables, "path/to/csv/") | |||||
| ``` | |||||
| Then later you can use | |||||
| ```r | |||||
| res <- read_from_csv("path/to/csv/") | |||||
| ``` | |||||
| to load your stored tibbles very fast. | |||||
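To make the caching idea concrete, here is a minimal base-R sketch of a CSV round trip for a named list of tables. The helpers `save_tables` and `load_tables` are hypothetical names introduced only for this illustration; they are not the package's `write_to_csv` and `read_from_csv`, whose implementation may differ.

```r
# Hypothetical helper: write each table of a named list to "<name>.csv".
save_tables <- function(tables, dir) {
  dir.create(dir, recursive = TRUE, showWarnings = FALSE)
  for (name in names(tables)) {
    write.csv(tables[[name]],
              file.path(dir, paste0(name, ".csv")),
              row.names = FALSE)
  }
  invisible(dir)
}

# Hypothetical helper: read all CSV files of a directory into a named list.
load_tables <- function(dir) {
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
  tables <- lapply(files, read.csv, stringsAsFactors = FALSE)
  names(tables) <- tools::file_path_sans_ext(basename(files))
  tables
}

# Round trip with an invented toy table in a temporary directory.
tables <- list(
  speeches = data.frame(id = c("ID1", "ID2"),
                        speaker = c("s1", "s2"),
                        stringsAsFactors = FALSE)
)
cache_dir <- file.path(tempdir(), "csv_cache_example")
save_tables(tables, cache_dir)
restored <- load_tables(cache_dir)
print(restored$speeches)
```

Storing one file per table keeps the list names recoverable from the file names, so the restored list can be used in place of the freshly parsed one.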
| NEVER use source(...), etc.! Also NEVER use library(...). | |||||
| To add new packages (as dependency), use: | |||||
| ```r | |||||
| use_package("my-good-old-package") | |||||
| ``` | |||||
| To make package imports available, you have to add them to `R/hateimparlament-package.R` | |||||
| as `@import <package>`. | |||||
| To create or reload the documentation (calls roxygen): | |||||
| ```r | |||||
| document() | |||||
| ``` | |||||
| To build a vignette: | |||||
| ```r | |||||
| rmarkdown::render("vignettes/test.Rmd") | |||||
| ``` | |||||