The WordLens-Project

2 R and the Tidyverse


Last updated 2 years ago

What is R?

R is a programming language and environment for statistical computing and graphics. It was created in 1993 by Ross Ihaka and Robert Gentleman at the University of Auckland, New Zealand. The name R is derived from the first letters of the creators' first names and also serves as a play on the name of the S programming language, which inspired R.

The R language has its roots in the S programming language, developed at Bell Laboratories in the 1970s by John Chambers and his colleagues. The S language provided an interactive programming environment for data analysis and graphics, which influenced the development of R. Over the years, R has evolved into a popular open-source platform, boasting a large and active community of users and developers.

R is a versatile programming language used for various purposes, including data manipulation, statistical analysis, and data visualization. It is particularly popular among statisticians, data scientists, and researchers due to its extensive library of packages, which extend its functionality and make it easier to perform complex tasks. In addition, R's open-source nature means that users can contribute new packages and improve existing ones, resulting in a constantly growing ecosystem.
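To give a first impression of what statistical computing in R looks like, here is a minimal sketch that uses only built-in functions. The `word_counts` values are made up for illustration and do not come from any course data set.

```r
# Hypothetical example data: word counts of five short texts
word_counts <- c(120, 85, 143, 97, 110)

# Descriptive statistics with built-in functions
mean_count <- mean(word_counts)      # arithmetic mean
median_count <- median(word_counts)  # middle value of the sorted counts

# Vectorized arithmetic: the division is applied to every element at once
relative <- word_counts / sum(word_counts)

print(mean_count)
print(median_count)
print(round(relative, 2))
```

No packages are needed here. Vectors and functions such as `mean()` are part of base R.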

The Tidyverse

The Tidyverse is a collection of R packages designed to simplify and streamline the data analysis process. The term "Tidyverse" was coined by Hadley Wickham, a prominent R developer and data scientist, in 2016. The Tidyverse packages share a common design philosophy and are built to work together seamlessly. The main goal of the Tidyverse is to provide a consistent and user-friendly approach to data analysis, making it easier for users to work with data in R.

Some of the most popular Tidyverse packages include:

  • readr: A package for reading and writing rectangular text data, such as CSV and TSV files

  • dplyr: A package for data manipulation, with tools for filtering, sorting, and aggregating data

  • tidyr: A package for cleaning and reshaping data

  • ggplot2: A powerful and flexible package for creating data visualizations

In this course, you'll learn about all of them and apply them hands-on, some in more depth than others.
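The following sketch gives a rough idea of how the four packages fit together in one small workflow. The data (`csv_text`) and all variable names are invented for illustration. It assumes the tidyverse is already installed and that readr is at version 2.0 or later, which is needed for passing literal text with `I()`.

```r
library(tidyverse)

# Hypothetical data, supplied as literal CSV text so the example is self-contained
csv_text <- "author,year,words
Austen,1811,120
Austen,1813,122
Dickens,1837,160
Dickens,1859,135"

# readr: parse the CSV text into a tibble (I() marks the string as literal data)
books <- read_csv(I(csv_text), show_col_types = FALSE)

# dplyr: filter rows, aggregate per author, and sort the result
summary_tbl <- books %>%
  filter(year > 1812) %>%
  group_by(author) %>%
  summarise(mean_words = mean(words), n_books = n()) %>%
  arrange(desc(mean_words))

# tidyr: reshape the summary from wide to long format
long_tbl <- summary_tbl %>%
  pivot_longer(
    cols = c(mean_words, n_books),
    names_to = "measure",
    values_to = "value"
  )

# ggplot2: build a simple bar chart of the mean per author
p <- ggplot(summary_tbl, aes(x = author, y = mean_words)) +
  geom_col()

print(summary_tbl)
print(long_tbl)
```

Each step takes a data frame and returns a data frame, which is why the functions chain together with the pipe operator `%>%`. Calling `print(p)` in an interactive session displays the chart.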

Installation

You need to install the tidyverse package only once; after that, you can use it in all your scripts. Installation takes a single line of code:

install.packages("tidyverse")

Once installed, the package can be loaded like any other library. It is good practice to load all required libraries at the beginning of a script:

library(tidyverse)
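If you share scripts with others, one possible pattern is to install the package only when it is missing. This is a sketch rather than a requirement of the course. It assumes a CRAN mirror is configured, which is the default in interactive sessions such as RStudio.

```r
# Install the tidyverse only if it is not already available on this machine
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse")
}

# Load the core tidyverse packages (including readr, dplyr, tidyr, and ggplot2)
library(tidyverse)
```

`requireNamespace()` returns `FALSE` instead of raising an error when a package is missing, so the check is safe to run repeatedly.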