These visualizations show the results of a panoramic analysis of António Ramos Rosa’s poetry. All values are absolute frequencies. The quantitative analysis was carried out after removing stop words and lemmatizing the corpus. Removing stop words excludes articles, conjunctions and other words that carry little meaning on their own. Lemmatization groups inflected forms under a single entry, so that, for example, the 2,260 occurrences of the term “palavra” [“word”] include both its singular and plural forms. The corpus consists of 79 books and 391,890 words, reduced to 181,291 after the stop words were removed. The analyses were developed in R, within the RStudio environment, and the visualizations were produced with RAWGraphs. The code is available for inspection and reuse (↓ R script).
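The preprocessing described above can be illustrated with a minimal sketch. It is written in Python for illustration only and is not the R script used for the analysis. The stop word list, the lemma lookup table and the sample sentence are all hypothetical, and a real pipeline would rely on a full Portuguese stop word list and a proper lemmatizer.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny resources (assumptions for illustration only).
STOP_WORDS = {"a", "as", "o", "os", "e", "de", "da", "do", "que", "em"}
LEMMAS = {"palavras": "palavra", "pedras": "pedra"}


def tokenize(text):
    """Lowercase the text and split it into runs of letters (accents included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def term_frequencies(text):
    """Return (absolute lemma frequencies, total tokens, tokens left after stop word removal)."""
    tokens = tokenize(text)
    kept = [t for t in tokens if t not in STOP_WORDS]
    # Map each remaining token to its lemma; unknown tokens are kept unchanged.
    lemmas = [LEMMAS.get(t, t) for t in kept]
    return Counter(lemmas), len(tokens), len(kept)


sample = "A palavra e as palavras da pedra"  # hypothetical sample text
freqs, total, remaining = term_frequencies(sample)
print(freqs.most_common())
print(total, remaining)
```

In this toy example the singular “palavra” and the plural “palavras” are counted under the same lemma, and the token count drops once the stop words are removed, which mirrors on a small scale the reduction reported for the corpus.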
What are the most frequent terms in António Ramos Rosa’s poetry?
↓ data
What is the distribution of the most frequent terms by book?
↓ data
Which terms appear in the largest number of books?
↓ data
What is the distribution, by book, of the terms that appear in the largest number of books?
↓ data
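The four questions above rest on two different counts: how often a term occurs overall, and in how many books it occurs at all. The following Python sketch shows the difference under stated assumptions. It is not the R code behind the visualizations, and the book titles and lemma lists are hypothetical placeholders for a corpus that has already been filtered and lemmatized.

```python
from collections import Counter

# Hypothetical input: one list of lemmas per book (stop words already removed).
books = {
    "Book A": ["palavra", "pedra", "palavra", "luz"],
    "Book B": ["palavra", "luz", "luz"],
    "Book C": ["pedra", "palavra"],
}

# Absolute frequency of each term within each book.
per_book = {title: Counter(lemmas) for title, lemmas in books.items()}

# Absolute frequency of each term across the whole corpus.
total = Counter()
for counts in per_book.values():
    total.update(counts)

# Number of books in which each term appears at least once.
book_count = Counter()
for counts in per_book.values():
    book_count.update(counts.keys())


def distribution(term):
    """Absolute frequency of a term in every book (0 where it is absent)."""
    return {title: counts[term] for title, counts in per_book.items()}


print(total.most_common(3))
print(book_count.most_common(3))
print(distribution("luz"))
```

A term can rank high on one count and lower on the other: in this toy data “luz” occurs more often than “pedra” overall, yet both appear in the same number of books.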