An R package to analyze the parliamentary records of the 19th legislative period of the Bundestag, the German parliament.


How to develop

# everything works with devtools (loads some other packages too)
library(devtools)

# reload all package functions
load_all()

# write the tables to CSV files to speed up loading
tables <- read_all()
tables <- repair(tables)
write_to_csv(tables)
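The idea behind the CSV cache can be sketched in base R. This is not the package's write_to_csv() implementation; the toy tables, the temporary directory, and the one-file-per-table layout are all assumptions made for illustration.

```r
# Toy stand-ins for the parsed tables (all values are invented).
tables <- list(
  speaker  = data.frame(id = c("s1", "s2"), fraction = c("SPD", "FDP")),
  speeches = data.frame(id = c("r1", "r2"), speaker = c("s1", "s2"))
)

# Hypothetical cache directory; one CSV file per table.
out_dir <- file.path(tempdir(), "csv")
dir.create(out_dir, showWarnings = FALSE)
for (name in names(tables)) {
  write.csv(tables[[name]], file.path(out_dir, paste0(name, ".csv")),
            row.names = FALSE)
}

# Later sessions can read the CSV files instead of parsing the XML again.
files <- list.files(out_dir, pattern = "\\.csv$", full.names = TRUE)
reloaded <- lapply(files, read.csv, stringsAsFactors = FALSE)
names(reloaded) <- sub("\\.csv$", "", basename(files))
```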

We NEVER use source(...) and the like, and we NEVER use library(...) to pull in other packages. To add a new package as a dependency, use:

use_package("my-good-old-package")

To make the functions of an imported package available, add the package to R/hateimparlament-package.R as an @import <package> tag.
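As a rough sketch, the package-level file could look like this. The two imported packages (dplyr, tibble) are assumptions chosen for illustration and are not taken from the actual file. roxygen2 turns each @import tag into an import() entry in NAMESPACE.

```r
# R/hateimparlament-package.R (sketch; the imported packages are assumptions)

#' @import dplyr
#' @import tibble
#' @keywords internal
"_PACKAGE"
```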

To create or regenerate the documentation (this calls roxygen2):

document()

To build a vignette:

rmarkdown::render("vignettes/bla.Rmd")

Download

Before parsing, all protocols must be downloaded using fetch.R:

fetch_all("../inst/records/") # path to directory where records should be stored

Parsing

Tables

parse.R parses all downloaded protocols and creates 5 tibbles. repair.R then cleans up the errors in these tibbles.

read_all("../inst/records/") %>% repair()

Speaker

Structure: id, first_name, last_name, fraction, title, role_short, role_long.

Obtained from the <rednerliste> (speaker list) entry at the end of each protocol.

Speeches

Structure: id, speaker.

The speech id is specified in the protocol and is unique. A speech is a <rede> entry in the session proceedings (Sitzungsverlauf). A speech always has a main speaker (the one standing at the lectern).

Within a speech, there can be different kinds of entries:

Comments: applause, interjections, etc.
Talks: mostly contributions by the main speaker, but also questions interposed by others.

When parsing, these entries are stored in the talks, comments and applause tables.

Talks

Structure: speech_id, speaker, content.

These are the actual talk entries that appear within speeches.

speech_id: The speech in which the contribution appears.
speaker: The speaker of the contribution.
content: The text of the contribution.
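To illustrate how the talks table can be combined with the speaker table, here is a minimal sketch in base R. The data are invented, base data frames stand in for the package's tibbles, and it assumes that talks$speaker holds the speaker's id.

```r
# Toy stand-ins for the speaker and talks tables (all values are invented).
speaker <- data.frame(
  id       = c("s1", "s2"),
  fraction = c("SPD", "FDP"),
  stringsAsFactors = FALSE
)
talks <- data.frame(
  speech_id = c("r1", "r1", "r2"),
  speaker   = c("s1", "s2", "s2"),
  content   = c("Guten Tag, meine Damen und Herren", "Eine Frage", "Vielen Dank"),
  stringsAsFactors = FALSE
)

# Count whitespace-separated words in each talk.
talks$n_words <- lengths(strsplit(talks$content, "\\s+"))

# Attach each speaker's fraction (assumption: talks$speaker matches speaker$id).
joined <- merge(talks, speaker, by.x = "speaker", by.y = "id")

# Sum the word counts per fraction.
words_per_fraction <- aggregate(n_words ~ fraction, data = joined, FUN = sum)
print(words_per_fraction)
```

The same join could be written with dplyr verbs on the real tibbles; base R is used here only to keep the sketch free of dependencies.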

Comments

These are the interjections that appear during the speeches.

They have the following structure:

speech_id: The speech that was interrupted.
on_speaker: The speaker who was interrupted.
fraction
commenter: The person who interrupted the speech.
comment: The content of the comment.
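A table of this shape lends itself to simple counts. The sketch below uses invented rows and assumes that the fraction column holds the parliamentary group of the commenter, which the description above does not state explicitly.

```r
# Toy stand-in for the comments table (all values are invented).
comments <- data.frame(
  speech_id  = c("r1", "r1", "r2", "r2"),
  on_speaker = c("s1", "s1", "s2", "s2"),
  fraction   = c("AfD", "SPD", "AfD", NA),  # assumed: the commenter's fraction
  commenter  = c("c1", "c2", "c1", NA),
  comment    = c("Unsinn!", "Sehr richtig!", "Na, na!", "Zuruf"),
  stringsAsFactors = FALSE
)

# Interjections per fraction; rows without a fraction are dropped by table().
per_fraction <- table(comments$fraction)

# Interjections received by each interrupted speaker.
per_target <- table(comments$on_speaker)

print(per_fraction)
print(per_target)
```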

Applause

A logical table that records, for each speech and speaker, which parliamentary groups applauded and which did not.

Structure: speech_id, on_speaker, CDU_CSU, SPD, FDP, DIE_LINKE, BUENDNIS_90_DIE_GRUENEN, AfD
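Because the fraction columns are logical, sums give counts directly. The following sketch uses invented rows and assumes that each row stands for one applause event.

```r
# Toy stand-in for the applause table (all values are invented).
applause <- data.frame(
  speech_id  = c("r1", "r1", "r2"),
  on_speaker = c("s1", "s1", "s2"),
  CDU_CSU    = c(TRUE, FALSE, TRUE),
  SPD        = c(TRUE, TRUE, FALSE),
  FDP        = c(FALSE, FALSE, TRUE),
  DIE_LINKE  = c(FALSE, TRUE, FALSE),
  BUENDNIS_90_DIE_GRUENEN = c(FALSE, TRUE, FALSE),
  AfD        = c(FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Every column except the two key columns is a fraction.
fractions <- setdiff(names(applause), c("speech_id", "on_speaker"))

# How often each fraction applauded overall (TRUE counts as 1).
total <- colSums(applause[fractions])

# How often each fraction applauded each speaker.
per_speaker <- aggregate(applause[fractions],
                         by = list(on_speaker = applause$on_speaker),
                         FUN = sum)
print(total)
print(per_speaker)
```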

Analysis

analysis.R provides functions to analyze the plenary protocols (“Plenarprotokolle”) and to create plots.

The vignettes contain various analyses of the protocols.

How to develop

# everything works with devtools (loads some other packages too)
library(devtools)

# reload all package functions
load_all()

# write the tables to CSV files to speed up loading
tables <- read_all()
tables <- repair(tables)
write_to_csv(tables)

We NEVER use source and the like, and we NEVER use library(...) to pull in other packages. To add a new package as a dependency, use:

use_package("my-good-old-package")

To make the functions of an imported package available, add the package to R/hateimparlament-package.R as an @import <package> tag.

To create or regenerate the documentation (this calls roxygen2):

document()

To build a vignette:

rmarkdown::render("vignettes/bla.Rmd")

Download

Before any analysis, fetch.R must be run to download all protocols.

Parsing

Tables

parse.R parses individual protocols and creates 5 tibbles.

Speakers (Redner)

Structure: id, vorname, nachname, fraction, titel, rolle_kurz, rolle_lang

The roles are, for example, “Bundeskanzlerin”. Unfortunately they are gendered and therefore probably tedious to analyze.

Obtained from the <rednerliste> entry at the end of the protocols.

Speeches (Reden)

Structure: id, redner

The speech id is specified in the protocol and is unique. A speech is a <rede> entry in the session proceedings (Sitzungsverlauf). A speech always has a main speaker (the one standing at the lectern).

Within a speech, there can be different kinds of contributions:

Comments: applause, interjections, etc.
Contributions (Redebeiträge): mostly by the main speaker, but also questions interposed by others. These are stored in the Talks table when parsing.

Talks

Structure: rede_id, redner, content

These are the actual contributions that appear within rede entries, where:

rede_id: The speech in which the contribution appears.
redner: The speaker of the contribution.
content: The text of the contribution. (Important: the procedural remarks of the President of the Bundestag are currently not filtered out, so they appear in the content even though they are not spoken by redner. To be fixed -> Issues!)

Still to parse (all optional):

Comments (currently only the <p> elements within speeches are collected). It still needs to be decided how these should be collected.
Metadata? Some of it is encoded in the rede_id values.

Combining the tables of the protocols

Eventually, all tables should be combined into large tables spanning all protocols.

Analysis

Overlap between AfD vocabulary and Hitler's speeches?
Speaking shares by gender (unfortunately the speaker list contains no data on this), fraction, etc.
Ideas, ideas, ideas ...