Merge branch 'master' of https://git.flavigny.de/christian/hateimparlament

genderequality-alternative
JosuaKugler 4 years ago
parent
commit
7067877584
4 changed files with 62 additions and 14 deletions
  1. NAMESPACE (+2 -0)
  2. R/analyze.R (+31 -0)
  3. README.md (+5 -0)
  4. vignettes/funwithdata.Rmd (+24 -14)

NAMESPACE (+2 -0)

@@ -1,6 +1,8 @@
# Generated by roxygen2: do not edit by hand

export(fetch_all)
+export(find_word)
+export(join_redner)
export(read_all)
export(read_from_csv)
export(repair)


R/analyze.R (+31 -0)

@@ -0,0 +1,31 @@
#' @export
find_word <- function(res, word) {
    talks <- res$talks
    # Count case-insensitive regex matches of `word` in each talk's content.
    mutate(talks, occurences = sapply(
        str_match_all(talks$content, regex(word, ignore_case = TRUE)),
        nrow
    ))
}

#' @export
join_redner <- function(tb, res, fraktion_only = FALSE) {
    # Attach speaker metadata (including `fraktion`) via the speaker id.
    joined <- left_join(tb, res$redner, by = c("redner" = "id"))
    if (fraktion_only) {
        select(joined, "fraktion")
    } else {
        joined
    }
}

party_colors <- c(
    SPD = "#DF0B25",
    "CDU/CSU" = "#000000",
    AfD = "#1A9FDD",
    "AfD&Fraktionslos" = "#1A9FDD",
    "DIE LINKE" = "#BC3475",
    "BÜNDNIS 90 / DIE GRÜNEN" = "#4A932B",
    FDP = "#FEEB34",
    Fraktionslos = "#FEEB34"
)

#' @export
bar_plot_fraktionen <- function(tb) {
    # Expects columns `fraktion` and `n`; bars are ordered by descending `n`.
    ggplot(tb, aes(x = reorder(fraktion, -n), y = n, fill = fraktion)) +
        scale_fill_manual(values = party_colors) +
        geom_bar(stat = "identity")
}
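
To see how `find_word` and `join_redner` combine, here is a minimal sketch on toy data. The tibbles and their values are hypothetical; only the column names (`content`, `redner`, `id`, `fraktion`) are taken from the code above and the vignette. The two functions are repeated so that the snippet runs standalone, assuming `dplyr` and `stringr` are installed.

```r
library(dplyr)
library(stringr)

# Hypothetical toy data; only the column names follow the package's tables.
res <- list(
  talks = tibble(
    redner  = c("r1", "r2", "r1"),
    content = c("Budget, budget and more BUDGET",
                "Nothing relevant",
                "The budget again")
  ),
  redner = tibble(
    id       = c("r1", "r2"),
    fraktion = c("SPD", "FDP")
  )
)

# Same logic as find_word() in R/analyze.R, repeated for a standalone run.
find_word <- function(res, word) {
  talks <- res$talks
  mutate(talks, occurences = sapply(
    str_match_all(talks$content, regex(word, ignore_case = TRUE)),
    nrow
  ))
}

# Same logic as join_redner() in R/analyze.R.
join_redner <- function(tb, res, fraktion_only = FALSE) {
  joined <- left_join(tb, res$redner, by = c("redner" = "id"))
  if (fraktion_only) select(joined, "fraktion") else joined
}

# Count matches per talk, keep talks with at least one hit, attach the party.
counts <- find_word(res, "budget")
hits <- counts %>%
  filter(occurences > 0) %>%
  join_redner(res)

print(hits)
```

Because `word` is passed to `regex()`, it is treated as a regular expression, so characters such as `.` or `(` in a search term would need escaping.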

README.md (+5 -0)

@@ -22,6 +22,11 @@ To reload / build the documentation (calls roxygen)
document()
```

+Build vignettes
+```r
+rmarkdown::render("vignettes/bla.Rmd")
+```
+
# Download

Before any analysis can be done, fetch.R must be run to download all records.


vignettes/funwithdata.Rmd (+24 -14)

@@ -29,32 +29,42 @@ fetch_all("../records/") # path to directory where records should be stored
Second, those `.xml` files need to be parsed into `R` `tibbles`. This is accomplished by:
```r
read_all("../records/") %>% repair() -> res
-
-reden <- res$reden
-redner <- res$redner
-talks <- res$talks
```
We also used `repair` to fix a bunch of formatting issues in the records and unpacked
the result into more descriptive variables.

For development purposes, we load the tables from csv files.
```{r}
-tables <- read_from_csv('../csv/')
-
-comments <- tables$comments
-reden <- tables$reden
-redner <- tables$redner
-talks <- tables$talks
+res <- read_from_csv('../csv/')
+```
+and unpack our tibbles
+```{r}
+comments <- res$comments
+reden <- res$reden
+redner <- res$redner
+talks <- res$talks
```


## Analysis


Now we can start analysing our parsed dataset, e.g. find out which party gives the most talks:
-```{r}
-left_join(reden, redner, by=c("redner" = "id")) %>%
+```{r, fig.width=10}
+join_redner(reden, res) %>%
    group_by(fraktion) %>%
    summarize(n = n()) %>%
-    ggplot(aes(x = fraktion, y = n)) +
-    geom_bar(stat = "identity")
+    arrange(n) %>%
+    bar_plot_fraktionen()
```


+### Count occurrences of a word
+
+```{r, fig.width=10}
+find_word(res, "hitler") %>%
+    filter(occurences > 0) %>%
+    join_redner(res) %>%
+    select(content, fraktion) %>%
+    group_by(fraktion) %>%
+    summarize(n = n()) %>%
+    arrange(desc(n)) %>%
+    bar_plot_fraktionen()
+```
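
The per-party counting used in the vignette can be tried without the parliamentary records. The sketch below assumes hypothetical toy tables, uses `count()` in place of `group_by()` plus `summarize()`, and `geom_col()` in place of `geom_bar(stat = "identity")`. Both are equivalent shortcuts and not what the commit itself uses. The palette is a two-entry subset of `party_colors`.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy tables; only the column names follow the vignette.
reden <- tibble(redner = c("r1", "r1", "r2", "r3"))
redner <- tibble(
  id       = c("r1", "r2", "r3"),
  fraktion = c("SPD", "FDP", "SPD")
)

# Two entries copied from party_colors in R/analyze.R.
party_colors <- c(SPD = "#DF0B25", FDP = "#FEEB34")

# Join speeches to speakers, then count speeches per party.
by_fraktion <- left_join(reden, redner, by = c("redner" = "id")) %>%
  count(fraktion, name = "n") %>%
  arrange(desc(n))

# Bars ordered by descending count and filled with the party colour.
p <- ggplot(by_fraktion, aes(x = reorder(fraktion, -n), y = n, fill = fraktion)) +
  geom_col() +
  scale_fill_manual(values = party_colors) +
  labs(x = "fraktion", y = "number of talks")

print(by_fraktion)
```

Printing `p` renders the chart. Since `reorder(fraktion, -n)` fixes the bar order, the `arrange()` step only affects the printed table, not the plot.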
