From 864c0016cc1b9516963b3936e2bfa4c17d54f39a Mon Sep 17 00:00:00 2001
From: flavis
Date: Tue, 10 Aug 2021 20:54:37 +0200
Subject: [PATCH 1/4] update readme

---
 README.md | 169 ++++++++++++++++++++++++++++++++----------------------
 1 file changed, 100 insertions(+), 69 deletions(-)

diff --git a/README.md b/README.md
index 0030d85..0a197e0 100644
--- a/README.md
+++ b/README.md
@@ -1,106 +1,97 @@
-# How to develop
+# Description
 
-```r
-# everything works with devtools (loads some other packages too)
-library(devtools)
-
-# reload all package functions
-load_all()
+R package to analyze parliamentary records of the 19th legislative period of the Bundestag,
+the German parliament.
 
-#write to CSV files to speed up loading
-tables <- read_all()
-tables <- repair(tables)
-write_to_csv(tables)
-```
-We NEVER use source(...), etc.! Also NEVER use library(...).
-But to add new packages (as dependency), use:
-```r
-use_package("my-good-old-package")
-```
-To make package imports available, you have to add them to `R/hateimparlament-package.R`
-as `@import `.
-
-To reload / create documentation (calls roxygen)
-```r
-document()
-```
+# Features
 
-Build vignettes
-```r
-rmarkdown::render("vignettes/bla.Rmd")
-```
+The package mainly supplies 4 functionalities:
 
-# Download
+## Download records
 
-Before parsing, fetch.R must be run to download all protocols.
+To analyze records, they need to be downloaded. This is done with `fetch_all`:
 
 ```r
-fetch_all("../inst/records/") # path to directory where records should be stored
+fetch_all("records/", create = TRUE) # path to directory where records should be stored
 ```
+This downloads all parliamentary records and stores them as `.xml` files in the given directory.
 
-# Parsing
-
-## tables
+## Parse records
 
-parse.R parses all downloaded logs and creates 5 tibbles.
-repair.R then cleans up the errors in these tibbles.
+To use the records in R, they are converted to `tibble`s with
 
 ```r
-read_all("../inst/records/") %>% repair()
+res_raw <- read_all("records/") # path to directory where records are stored
 ```
-
+`res_raw` is a named list with 5 `tibble`s:
 
 ### Speaker
 
-structure: `id` , `first_name` , `last_name` , `fraction` , `title` , `role_short`, `role_long`.
-
+Table of all speakers of this legislative period.
 
-
-Obtained from the `` entry at the end of the transcripts.
+Fields:
+- `id`: Unique speaker id
+- `prename`: Prename
+- `lastname`: Surname
+- `fraction`: Name of fraction if the speaker is member of parliament.
+- `title`: Title, e.g. ,,Prof''
+- `role_short`: Short name of role, e.g. ,,Bundeskanzlerin''
+- `role_long`: Long name of role
 
 ### Speeches
 
-Structure: `id` , `speaker`
-
-The speeches `id` is specified in the protocol and is unique.A speech is a `` entry in the session history. A speech always has a main speaker (the one standing at the front of the lectern).
-
-Within a speech, there can be different speech entries:
-
-- Comments: Applause, interjections, etc.
-- Speeches: Typically mainly the main speaker, but also interjections.
-These are stored in the talks, comments and applause tables when parsing.
+Table of all speeches given during this legislative period.
 
+Fields:
+- `id`: Unique speech id
+- `speaker`: Principal speaker (the person standing behind the lectern during the speech).
+- `date`: Date of session
 
 ### Talks
 
-Structure: `speech_id` , `speaker` , `content`.
+Within a speech, there can be multiple talks by different people. Mostly this is the main speech
+by the principal speaker, but usually there are questions by other members of parliament or
+order calls by the president of the Bundestag.
 
-These are the actual talk entries that appear within _speeches_.
+Fields:
+- `speech_id`: Speech in which this talk has been given
+- `speaker`: Person that actually talks
+- `content`: Spoken content
 
-- `speech_id`: the speech in which the contribution appears.
-- `speaker`: The speaker of the speech entry.
-- `content`: The content of the speech.
-
-###comments
+### Comments
 
 These are the interjections that appear during the speeches.
 
-They have the following structure:
-- `speech_id`: The speech that was interrupted.
-- `on_speaker`: The speaker who was interrupted.
-- `fraction`
-- `commenter`: The person who interrupted the speech.
-- `comment`: The content of the comment.
+Fields:
+- `speech_id`: The speech that was interrupted
+- `on_speaker`: The speaker who was interrupted
+- `fraction`: The fraction of the commenter
+- `commenter`: The person who interrupted the speech
+- `comment`: The content of the comment
+
+### Applause
+
+Table containing all the rounds of applause that happened during this legislative period.
 
-###applause
+Fields:
+- `speech_id`: Speech during which was applauded
+- `on_speaker`: Speaker who was applauded
 
-The logical table shows which party applauded for which speaker with explicit speech and which did not.
+And then logical fields `CDU_CSU`, `SPD`, `FDP`, `DIE_LINKE`, `BUENDNIS_90_DIE_GRUENEN`, `AfD`
+for every fraction in the Bundestag, signifying whether this fraction applauded.
 
-structure: `speech_id`, `on_speaker`, `CDU_CSU`, `SPD`, `FDP`, `DIE_LINKE`, `BUENDNIS_90_DIE_GRUENEN`, `AfD`
+## Repair records
+The parliamentary records usually contain some major and minor formatting issues. These are
+mostly resolved by using
+```
+res <- repair(res_raw)
+```
+By passing `lookup_speaker = TRUE`, even commenters in
+`res_raw$comments$ are matched with their respective speaker id.
 
-# Analysis
+## Analysis
 
-analysis.R provides some functions to analyze the "Plenarprotokolle" and to create plots.
+`analyze.R` provides some functions to analyze the parliamentary records and draw some plots.
 
 In the vignettes you can find different analyses of the protocols, for example:
 
@@ -110,4 +101,44 @@ In the vignettes you can find different analyses of the protocols, for example:
 - "When are which topics discussed the most?"
 - ...
 
+# Contributing
+
+Developing works the easiest with `devtools`:
+```r
+library(devtools)
+```
+When you changed something or added some functionality, you can reload all package functions with
+```r
+load_all()
+```
+If you want to avoid reading all records every time you start a new R session, you can
+write your parsed tibbles to CSV files:
+
+```
+tables <- read_all()
+tables <- repair(tables)
+write_to_csv(tables, "path/to/csv/")
+```
+Then later you can use
+```r
+res <- read_from_csv("path/to/csv/")
+```
+to load your stored tibbles very fast.
+
+NEVER use source(...), etc.! Also NEVER use library(...).
+To add new packages (as dependency), use:
+```r
+use_package("my-good-old-package")
+```
+To make package imports available, you have to add them to `R/hateimparlament-package.R`
+as `@import `.
 
+To reload / create documentation (calls roxygen)
+```r
+document()
+```
+
+Build vignettes
+```r
+rmarkdown::render("vignettes/test.Rmd")
+```

From 7e304d12bb1a03a3b513c52c3b71a808db8847db Mon Sep 17 00:00:00 2001
From: flavis
Date: Tue, 10 Aug 2021 20:58:29 +0200
Subject: [PATCH 2/4] fix formatting in readme and improve analysis section

---
 README.md | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 0a197e0..3c9f8d0 100644
--- a/README.md
+++ b/README.md
@@ -87,11 +87,18 @@ mostly resolved by using
 res <- repair(res_raw)
 ```
 By passing `lookup_speaker = TRUE`, even commenters in
-`res_raw$comments$ are matched with their respective speaker id.
+`res_raw$comments` are matched with their respective speaker id.
 
 ## Analysis
 
-`analyze.R` provides some functions to analyze the parliamentary records and draw some plots.
+Also some functions are provided to analyze the parliamentary records and draw some plots:
+
+- `bar_plot_fractions`
+- `find_word`
+- `join_speaker`
+- `word_usage_by_date`
+
+See their usage with the `?` operator.
 
 In the vignettes you can find different analyses of the protocols, for example:
 

From 6b1f8a64b2aa20eba1ce1fdd233546a0515692bd Mon Sep 17 00:00:00 2001
From: flavis
Date: Tue, 10 Aug 2021 21:14:52 +0200
Subject: [PATCH 3/4] add installation directives

---
 README.md | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/README.md b/README.md
index 3c9f8d0..1636564 100644
--- a/README.md
+++ b/README.md
@@ -3,6 +3,13 @@
 R package to analyze parliamentary records of the 19th legislative period of the Bundestag,
 the German parliament.
 
+# Installation
+
+Using the `remotes` package, this is easily installed via:
+```r
+remotes::install_url("https://git.flavigny.de/christian/hateimparlament/archive/master.zip")
+```
+
 # Features
 
 The package mainly supplies 4 functionalities:

From 8e691e5d117d838a786b33e123c379edd4c8b955 Mon Sep 17 00:00:00 2001
From: flavis
Date: Tue, 10 Aug 2021 21:24:39 +0200
Subject: [PATCH 4/4] update package meta data

---
 DESCRIPTION                    | 22 +++++++++++++++-------
 man/hateimparlament-package.Rd | 21 ++++++++++++++++++---
 2 files changed, 33 insertions(+), 10 deletions(-)

diff --git a/DESCRIPTION b/DESCRIPTION
index a9431bd..b256b8d 100644
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,13 +1,21 @@
 Package: hateimparlament
-Title: Protocolanalysis of German Bundestag
+Title: Recordanalysis Of Bundestag
 Version: 0.0.0.9000
-Authors@R:
-    person(given = "First",
-           family = "Last",
+Authors@R: c(
+    person(given = "Leon",
+           family = "Burgard",
+           role = c("aut")),
+    person(given = "Josua",
+           family = "Kugler",
+           role = c("aut")),
+    person(given = "Christian",
+           family = "Merten",
            role = c("aut", "cre"),
-           email = "first.last@example.com",
-           comment = c(ORCID = "YOUR-ORCID-ID"))
-Description: Downloads, parses and analyses protocols of the current German parliament (Bundestag).
+           email = "christian@merten.dev"))
+Description: Downloads, parses and analyses parliamentary records of the 19th legislative
+    period of the German parliament (Bundestag).
+URL: https://git.flavigny.de/christian/hateimparlament
+BugReports: https://git.flavigny.de/christian/hateimparlament/issues
 License: GPL (>= 3)
 Encoding: UTF-8
 LazyData: true
diff --git a/man/hateimparlament-package.Rd b/man/hateimparlament-package.Rd
index 44ff202..bfe2782 100644
--- a/man/hateimparlament-package.Rd
+++ b/man/hateimparlament-package.Rd
@@ -4,15 +4,30 @@
 \name{hateimparlament-package}
 \alias{hateimparlament}
 \alias{hateimparlament-package}
-\title{hateimparlament: Protocolanalysis of German Bundestag}
+\title{hateimparlament: Recordanalysis Of Bundestag}
 \description{
-Downloads, parses and analyses protocols of the current German parliament (Bundestag).
+Downloads, parses and analyses parliamentary records of the 19th legislative
+    period of the German parliament (Bundestag).
 }
 \details{
 hateimparlament ist ein großartiges Paket!
+}
+\seealso{
+Useful links:
+\itemize{
+  \item \url{https://git.flavigny.de/christian/hateimparlament}
+  \item Report bugs at \url{https://git.flavigny.de/christian/hateimparlament/issues}
+}
+
 }
 \author{
-\strong{Maintainer}: First Last \email{first.last@example.com} (\href{https://orcid.org/YOUR-ORCID-ID}{ORCID})
+\strong{Maintainer}: Christian Merten \email{christian@merten.dev}
+
+Authors:
+\itemize{
+  \item Leon Burgard
+  \item Josua Kugler
+}
 
 }
 \keyword{internal}