Merge branch 'master' into genderequality-alternative

4 years ago · 605e5e976a
--- a/DESCRIPTION
+++ b/DESCRIPTION
@@ -1,13 +1,21 @@
 Package: hateimparlament
 Title: Protocolanalysis of German Bundestag
 Title: Recordanalysis Of Bundestag
 Version: 0.0.0.9000
 Authors@R: 
    person(given = "First",
           family = "Last",
 Authors@R: c(
    person(given = "Leon",
           family = "Burgard",
           role = c("aut")),
    person(given = "Josua",
           family = "Kugler",
           role = c("aut")),
    person(given = "Christian",
           family = "Merten",
           role = c("aut", "cre"),
           email = "first.last@example.com",
           comment = c(ORCID = "YOUR-ORCID-ID"))
 Description: Downloads, parses and analyses protocols of the current German parliament (Bundestag).
           email = "christian@merten.dev"))
 Description: Downloads, parses and analyses parliamentary records of the 19th legislative
    period of the German parliament (Bundestag).
 URL: https://git.flavigny.de/christian/hateimparlament
 BugReports: https://git.flavigny.de/christian/hateimparlament/issues
 License: GPL (>= 3)
 Encoding: UTF-8
 LazyData: true
--- a/README.md
+++ b/README.md
@@ -1,106 +1,111 @@
 # How to develop
 # Description
 ```r
 # everything works with devtools (loads some other packages too)
 library(devtools)
 R package to analyze parliamentary records of the 19th legislative period of the Bundestag,
 the German parliament.
 # reload all package functions
 load_all()
 # Installation
 #write to CSV files to speed up loading
 tables <- read_all()
 tables <- repair(tables)
 write_to_csv(tables)
 ```
 We NEVER use source(...), etc.! Also NEVER use library(...). 
 But to add new packages (as dependency), use:
 The package can be installed with the `remotes` package:
 ```r
 use_package("my-good-old-package")
 remotes::install_url("https://git.flavigny.de/christian/hateimparlament/archive/master.zip")
 ```
 To make package imports available, you have to add them to `R/hateimparlament-package.R`
 as `@import <package>`.
 To reload / create documentation (calls roxygen)
 ```r
 document()
 ```
 # Features
 Build vignettes
 ```r
 rmarkdown::render("vignettes/bla.Rmd")
 ```
 The package provides four main features:
 # Download
 ## Download records
 Before parsing, fetch.R must be run to download all protocols.
 To analyze records, they need to be downloaded. This is done with `fetch_all`:
 ```r
 fetch_all("../inst/records/") # path to directory where records should be stored
 fetch_all("records/", create = TRUE) # path to directory where records should be stored
 ```
 This downloads all parliamentary records and stores them as `.xml` files in the given directory.
 # Parsing
 ## Parse records
 ## tables
 parse.R parses all downloaded logs and creates 5 tibbles.
 repair.R then cleans up the errors in these tibbles.
 To use the records in R, they are converted to `tibble`s with
 ```r
 read_all("../inst/records/") %>% repair()
 res_raw <- read_all("records/") # path to directory where records are stored
 ```
 `res_raw` is a named list with 5 `tibble`s:
 ### Speaker
 structure: `id` , `first_name` , `last_name` , `fraction` , `title` , `role_short`, `role_long`.
 Table of all speakers of this legislative period.
 Obtained from the `<speaker list>` entry at the end of the transcripts.
 Fields:
 - `id`: Unique speaker id
 - `prename`: First name
 - `lastname`: Surname
 - `fraction`: Name of fraction if the speaker is member of parliament.
 - `title`: Title, e.g. "Prof"
 - `role_short`: Short name of role, e.g. "Bundeskanzlerin"
 - `role_long`: Long name of role
 ### Speeches
 Structure: `id` , `speaker`
 Table of all speeches given during this legislative period.
 The speech `id` is specified in the protocol and is unique. A speech is a `<speech>` entry in the session history. A speech always has a main speaker (the one standing at the lectern).
 Fields:
 - `id`: Unique speech id
 - `speaker`: Principal speaker (the person standing behind the lectern during the speech).
 - `date`: Date of session
 Within a speech, there can be different speech entries:
 ### Talks
 - Comments: Applause, interjections, etc.
 - Speeches: Typically mainly the main speaker, but also interjections. 
 These are stored in the talks, comments and applause tables when parsing.
 Within a speech, there can be multiple talks by different people. Most of it is the main speech
 by the principal speaker, but there are usually also questions by other members of parliament or
 calls to order by the president of the Bundestag.
 Fields:
 - `speech_id`: Speech in which this talk has been given
 - `speaker`: Person that actually talks
 - `content`: Spoken content
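The relation between the speeches and talks tables can be sketched with base R. This is only an illustration on made-up rows that follow the field names listed above; the ids, names and dates are hypothetical, and the real tables returned by `read_all` are `tibble`s and may contain further columns.

```r
# Toy stand-ins for the speeches and talks tables (made-up rows, not real records).
speeches <- data.frame(
  id      = c("s1", "s2"),
  speaker = c("a", "b"),
  date    = as.Date(c("2020-01-15", "2020-01-16")),
  stringsAsFactors = FALSE
)
talks <- data.frame(
  speech_id = c("s1", "s1", "s2"),
  speaker   = c("a", "c", "b"),
  content   = c("Main speech.", "A question.", "Another speech."),
  stringsAsFactors = FALSE
)

# Attach the session date to every talk via the speech id.
talks_dated <- merge(talks, speeches[, c("id", "date")],
                     by.x = "speech_id", by.y = "id")

# Mark talks given by the principal speaker of the surrounding speech.
principal <- speeches$speaker[match(talks$speech_id, speeches$id)]
talks$by_principal <- talks$speaker == principal

print(talks_dated)
print(talks)
```

Talks where `by_principal` is `FALSE` are the questions and calls to order mentioned above.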
 ### Talks
 ### Comments
 Structure: `speech_id` , `speaker` , `content`.
 These are the interjections that appear during the speeches.
 These are the actual talk entries that appear within _speeches_.
 Fields:
 - `speech_id`: The speech that was interrupted
 - `on_speaker`: The speaker who was interrupted
 - `fraction`: The fraction of the commenter
 - `commenter`: The person who interrupted the speech
 - `comment`: The content of the comment
 - `speech_id`: the speech in which the contribution appears.
 - `speaker`: The speaker of the speech entry.
 - `content`: The content of the speech.
 ### Applause
 ###comments
 Table containing all the rounds of applause that happened during this legislative period.
 These are the interjections that appear during the speeches.
 Fields:
 - `speech_id`: Speech during which the applause occurred
 - `on_speaker`: Speaker who was applauded
 They have the following structure:
 - `speech_id`: The speech that was interrupted.
 - `on_speaker`: The speaker who was interrupted.
 - `fraction`
 - `commenter`: The person who interrupted the speech.
 - `comment`: The content of the comment.
 And then logical fields `CDU_CSU`, `SPD`, `FDP`, `DIE_LINKE`, `BUENDNIS_90_DIE_GRUENEN`, `AfD`
 for every fraction in the Bundestag, signifying whether this fraction applauded.
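As a small sketch of how the logical columns can be used, the following counts rounds of applause per fraction. The rows are made up for illustration and only the column names are taken from the description above; the real table is a `tibble` produced by `read_all`.

```r
# Toy stand-in for the applause table (made-up rows, not real records).
applause <- data.frame(
  speech_id  = c("s1", "s1", "s2"),
  on_speaker = c("a", "a", "b"),
  CDU_CSU    = c(TRUE, FALSE, TRUE),
  SPD        = c(TRUE, TRUE, FALSE),
  FDP        = c(FALSE, FALSE, TRUE),
  DIE_LINKE  = c(FALSE, TRUE, FALSE),
  BUENDNIS_90_DIE_GRUENEN = c(FALSE, TRUE, FALSE),
  AfD        = c(FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

fractions <- c("CDU_CSU", "SPD", "FDP", "DIE_LINKE",
               "BUENDNIS_90_DIE_GRUENEN", "AfD")

# Each TRUE is one round of applause by that fraction, so summing counts them.
applause_counts <- sapply(applause[fractions], sum)
print(applause_counts)
```

Because several fractions can applaud at the same time, the counts do not add up to the number of rows.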
 ###applause
 ## Repair records
 The logical table shows which party applauded for which speaker with explicit speech and which did not.
 The parliamentary records usually contain some major and minor formatting issues. These are
 mostly resolved by using
 ```r
 res <- repair(res_raw)
 ```
 By passing `lookup_speaker = TRUE`, even commenters in
 `res_raw$comments` are matched with their respective speaker id.
 structure: `speech_id`, `on_speaker`, `CDU_CSU`, `SPD`, `FDP`, `DIE_LINKE`, `BUENDNIS_90_DIE_GRUENEN`, `AfD`
 ## Analysis
 Some functions are also provided to analyze the parliamentary records and draw plots:
 # Analysis
 - `bar_plot_fractions`
 - `find_word`
 - `join_speaker`
 - `word_usage_by_date`
 analysis.R provides some functions to analyze the "Plenarprotokolle" and to create plots.
 See their usage with the `?` operator.
 In the vignettes you can find different analyses of the protocols, for example:
@@ -110,4 +115,44 @@ In the vignettes you can find different analyses of the protocols, for example:
 - "When are which topics discussed the most?"
 - ...
 # Contributing
 Development is easiest with `devtools`:
 ```r
 library(devtools)
 ```
 After you have changed something or added functionality, you can reload all package functions with
 ```r
 load_all()
 ```
 If you want to avoid reading all records every time you start a new R session, you can
 write your parsed tibbles to CSV files:
 ```r
 tables <- read_all()
 tables <- repair(tables)
 write_to_csv(tables, "path/to/csv/")
 ```
 Then later you can use
 ```r
 res <- read_from_csv("path/to/csv/")
 ```
 to load your stored tibbles very fast.
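The idea behind this caching step can be sketched with base R alone. This is not how `write_to_csv` or `read_from_csv` are implemented; it only assumes that the parsed result is a named list of tables and writes one CSV file per table. The table contents and the directory name are hypothetical.

```r
# Toy named list of tables (made-up rows, not real records).
tables <- list(
  speaker  = data.frame(id = c("a", "b"),
                        lastname = c("Example", "Sample"),
                        stringsAsFactors = FALSE),
  speeches = data.frame(id = c("s1", "s2"),
                        speaker = c("a", "b"),
                        stringsAsFactors = FALSE)
)

# Write one CSV file per table into a temporary directory.
csv_dir <- file.path(tempdir(), "csv_cache")
dir.create(csv_dir, showWarnings = FALSE)
for (name in names(tables)) {
  write.csv(tables[[name]],
            file.path(csv_dir, paste0(name, ".csv")),
            row.names = FALSE)
}

# Read the files back into a named list, using the file names as table names.
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
restored <- lapply(files, read.csv, stringsAsFactors = FALSE)
names(restored) <- sub("\\.csv$", "", basename(files))

print(names(restored))
```

Reading a few flat CSV files is typically much cheaper than parsing every `.xml` record again.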
 Never use `source(...)` or `library(...)` in package code.
 To add a new package as a dependency, use:
 ```r
 use_package("my-good-old-package")
 ```
 To make package imports available, you have to add them to `R/hateimparlament-package.R`
 as `@import <package>`.
 To reload / create documentation (calls roxygen)
 ```r
 document()
 ```
 Build vignettes
 ```r
 rmarkdown::render("vignettes/test.Rmd")
 ```
--- a/man/hateimparlament-package.Rd
+++ b/man/hateimparlament-package.Rd
@@ -4,15 +4,30 @@
 \name{hateimparlament-package}
 \alias{hateimparlament}
 \alias{hateimparlament-package}
 \title{hateimparlament: Protocolanalysis of German Bundestag}
 \title{hateimparlament: Recordanalysis Of Bundestag}
 \description{
 Downloads, parses and analyses protocols of the current German parliament (Bundestag).
 Downloads, parses and analyses parliamentary records of the 19th legislative
    period of the German parliament (Bundestag).
 }
 \details{
 hateimparlament is a great package!
 }
 \seealso{
 Useful links:
 \itemize{
  \item \url{https://git.flavigny.de/christian/hateimparlament}
  \item Report bugs at \url{https://git.flavigny.de/christian/hateimparlament/issues}
 }
 }
 \author{
 \strong{Maintainer}: First Last \email{first.last@example.com} (\href{https://orcid.org/YOUR-ORCID-ID}{ORCID})
 \strong{Maintainer}: Christian Merten \email{christian@merten.dev}
 Authors:
 \itemize{
  \item Leon Burgard
  \item Josua Kugler
 }
 }
 \keyword{internal}