9 R: Basics
9.2 Data Types
R has several built-in data types that are crucial for data analysis and computation. The commonly used data types include:
- Numeric (double and integer)
- Character (strings)
- Logical (TRUE or FALSE)
The data type of a variable/constant in R can be identified using the built-in functions class() or typeof(). For example, the following variables and their values demonstrate different data types:
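A minimal sketch of such a check is shown here. The variable names and values are illustrative assumptions, not part of the original text.

```r
# Illustrative values, one per basic data type
price    <- 19.99     # numeric (double)
count    <- 3L        # numeric (integer)
label    <- "widget"  # character
in_stock <- TRUE      # logical

class(price)     # "numeric"
typeof(price)    # "double"
class(count)     # "integer"
class(label)     # "character"
class(in_stock)  # "logical"
```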
9.2.1 Numeric Data
In R, numeric data consists of two types:
- Double (default) – Represents floating-point numbers with decimal precision.
- Integer – Represents whole numbers, defined explicitly using the L suffix.
x <- 10 # Default: Double
y <- 10L # Explicit Integer
typeof(x)
[1] "double"
typeof(y)
[1] "integer"
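The distinction matters once arithmetic is involved. The sketch below, using the illustrative names a and b, shows which operations keep the integer type and which produce a double.

```r
a <- 5L
b <- 2L

typeof(a + b)   # "integer": integer arithmetic stays integer
typeof(a / b)   # "double":  division always returns a double
typeof(a * 2)   # "double":  mixing with a double promotes the result
typeof(1:3)     # "integer": the colon operator builds integer sequences

is.integer(a)   # TRUE
is.numeric(a)   # TRUE, because integers are also numeric
```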
9.2.2 Character Data
Strings in R are created by assigning character values to a variable, using either double or single quotes. Strings can then be concatenated with functions such as paste() to build longer strings.
# R program for String Creation
# creating a string with double quotes
str1 <- "OK1"
cat("String 1 is :", str1)
String 1 is : OK1
# creating a string with single quotes
str2 <- 'OK2'
cat("String 2 is :", str2)
String 2 is : OK2
# single quotes inside a double-quoted string
str3 <- "This is 'acceptable and 'allowed' in R"
cat("String 3 is :", str3)
String 3 is : This is 'acceptable and 'allowed' in R
# double quotes inside a single-quoted string
str4 <- 'Hi, Wondering "if this "works"'
cat("String 4 is :", str4)
String 4 is : Hi, Wondering "if this "works"
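R displays strings with double quotes when printing, regardless of which quote style created them. A double-quoted string can contain single quotes, and a single-quoted string can contain double quotes. To use the same quote character that delimits the string, escape it with a backslash (\" inside double quotes, \' inside single quotes).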
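The quoting rules can be checked with a short sketch. The variable names below are illustrative assumptions.

```r
quote_double <- "She said \"hello\""   # escaped double quotes
quote_single <- 'It\'s fine'           # escaped single quote

cat(quote_double, "\n")   # She said "hello"
cat(quote_single, "\n")   # It's fine

nchar(quote_single)       # 9: the backslash is not part of the string
identical('OK', "OK")     # TRUE: both quote styles create the same string
```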
9.2.3 Logical Data
In R, logical data consists of values that represent Boolean (TRUE/FALSE) conditions.
Logical values are used in conditional statements, indexing, and logical operations.
R has three logical values:
- TRUE (can be written as T)
- FALSE (can be written as F)
- NA (logical missing value)
x <- TRUE
y <- FALSE
z <- NA # Missing logical value
typeof(x)
[1] "logical"
typeof(y)
[1] "logical"
typeof(z)
[1] "logical"
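As a rough illustration of logical values in indexing and conditions, the sketch below uses a hypothetical scores vector that is not part of the original text.

```r
scores <- c(55, 82, 91, 47)   # hypothetical data
passed <- scores >= 60        # logical vector: FALSE TRUE TRUE FALSE

scores[passed]   # 82 91 (indexing keeps elements where passed is TRUE)
sum(passed)      # 2, because TRUE counts as 1 and FALSE as 0

if (any(passed)) {
  cat("At least one score passed\n")
}

is.na(NA)        # TRUE: test for missing values with is.na(), not ==
```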
9.3 Variables
The following rules apply to R variable names:
- A variable name must start with a letter or a period (.) and can contain letters, digits, periods (.), and underscores (_). If it starts with a period, the period cannot be followed by a digit.
- A variable name cannot start with a number or underscore (_)
- Variable names are case-sensitive (age, Age and AGE are three different variables)
- Reserved words cannot be used as variables (TRUE, FALSE, NULL, if…)
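One way to test these rules is with base R's make.names(), which rewrites a string into a syntactically valid name. In the sketch below, is_valid_name() is a hypothetical helper (not a built-in function) that treats a name as valid when make.names() leaves it unchanged.

```r
# Hypothetical helper: a name is valid if make.names() does not alter it
is_valid_name <- function(name) {
  name == make.names(name)
}

is_valid_name(c("my_var", ".hidden", "age2"))    # TRUE TRUE TRUE
is_valid_name(c("2abc", "_x", ".2way", "TRUE"))  # FALSE FALSE FALSE FALSE

# Names are case-sensitive: these are three separate variables
age <- 30
Age <- 31
AGE <- 32
c(age, Age, AGE)   # 30 31 32
```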
9.3.1 The assignment operator
9.3.1.1 Using <- (Preferred Operator)
The <- operator is the standard way to assign values in R:
x <- 10
y <- "Hello, R!"
z <- TRUE
9.3.1.2 Using = (Not Recommended)
Although = can be used for assignment, it is generally not recommended: inside a function call, = names an argument instead of assigning a variable, which makes its meaning depend on context.
x = 10 # Works, but `<-` is preferred
💡 Best Practice:
- Always use <- for assignments to avoid ambiguity.
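The difference between the two operators inside a function call can be sketched as follows. assignment_demo() is a hypothetical wrapper used only to keep the experiment in its own scope.

```r
# Hypothetical wrapper so the test runs in a fresh local scope
assignment_demo <- function() {
  mean(x = c(1, 2, 3))                          # `=` names the argument x
  after_equals <- exists("x", inherits = FALSE) # no local variable x yet

  mean(x <- c(1, 2, 3))                         # `<-` assigns x, then passes it
  after_arrow <- exists("x", inherits = FALSE)  # now x exists

  c(after_equals = after_equals, after_arrow = after_arrow)
}

assignment_demo()
# after_equals  after_arrow
#        FALSE         TRUE
```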
9.4 Converting datatypes
Sometimes a value has a datatype that is not suitable for the operation we want to perform. For example, consider the variable annual_income in the code below:
annual_income <- "80000"
Suppose we wish to divide annual_income by 12 to get the monthly income. We cannot use annual_income directly because its datatype is character (a string), not a number, so numerical operations cannot be performed on it.
We need to convert annual_income to an integer. For that we use R’s built-in as.integer() function:
annual_income <- as.integer(annual_income)
monthly_income <- annual_income/12
print(paste0("monthly income = ", monthly_income))
[1] "monthly income = 6666.66666666667"
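To see why the conversion is needed, the sketch below attempts the division on the unconverted string and captures the resulting error. income_text is an illustrative name, and the error wording shown assumes an English locale.

```r
income_text <- "80000"   # hypothetical value stored as a string

# Arithmetic on a character value raises an error, captured here as text
error_text <- tryCatch(
  income_text / 12,
  error = function(e) conditionMessage(e)
)
error_text   # "non-numeric argument to binary operator" (English locale)

# After conversion the same arithmetic works
as.integer(income_text) / 12   # 6666.667
```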
Similarly, datatypes can be converted from one type to another using in-built R functions as shown below:
#Converting numeric to character
as.character(9)
[1] "9"
#Converting character to numeric
as.numeric('9.4')
[1] 9.4
#Converting logical to numeric
as.numeric(FALSE)
[1] 0
Note that any non-zero numeric value, if converted to the ‘logical’ datatype, will return TRUE, while converting 0 to the ‘logical’ datatype will return FALSE. Character strings such as "TRUE" or "FALSE" can also be converted to the ‘logical’ datatype; any other string returns NA.
# Converting a positive number to logical
as.logical(40)
[1] TRUE
# Converting zero to logical
as.logical(0)
[1] FALSE
# Converting a negative, non-integer number to logical
as.logical(-30.1)
[1] TRUE
Sometimes, conversion of a value may not be possible. For example, it is not possible to convert the variable greeting defined below to a number:
greeting <- "hello"
as.numeric(greeting)
Warning: NAs introduced by coercion
[1] NA
However, strings can be concatenated using the paste0() function:
paste0("hello", " there!")
[1] "hello there!"
The following table summarizes how to convert between Numeric, Character, and Logical types in R:
| From → To | Conversion Function | Example Usage | Notes | Failure Behavior |
|---|---|---|---|---|
| Numeric → Character | as.character(x) | as.character(42) → "42" | Converts numbers to strings | Not applicable (always succeeds) |
| Numeric → Logical | as.logical(x) | as.logical(0) → FALSE | 0 is FALSE, non-zero is TRUE | Returns NA if input is NA |
| Character → Numeric | as.numeric(x) | as.numeric("3.14") → 3.14 | The string must contain a valid number | Returns NA (with a warning) if conversion fails |
| Character → Logical | as.logical(x) | as.logical("TRUE") → TRUE | Accepts "TRUE", "true", "True", "T" and the FALSE equivalents | Returns NA for any other string |
| Logical → Numeric | as.numeric(x) | as.numeric(TRUE) → 1 | TRUE = 1, FALSE = 0 | NA stays NA |
| Logical → Character | as.character(x) | as.character(FALSE) → "FALSE" | Converts logical values to strings | Not applicable (always succeeds) |
Note: Always verify conversions using class() or typeof() to ensure expected results.
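A short sketch of that verification step follows. The input values are hypothetical, and suppressWarnings() is used only to keep the coercion warning out of the output.

```r
values <- c("TRUE", "true", "T", "yes", "1")   # hypothetical inputs
as.logical(values)        # TRUE TRUE TRUE NA NA

converted <- as.numeric("3.14")
typeof(converted)         # "double"
class(converted)          # "numeric"

# Hide the coercion warning, then check explicitly for a failed conversion
bad <- suppressWarnings(as.numeric("abc"))
is.na(bad)                # TRUE
```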
9.5 typeof() vs. class() in R
In R, typeof() and class() are used to determine different aspects of an object’s type:
- typeof(): Returns the low-level storage mode of an object.
- class(): Returns the high-level classification of an object.
x <- 10L # Integer value
typeof(x)
[1] "integer"
class(x)
[1] "integer"
y <- c(1, 2, 3) # Numeric vector
typeof(y)
[1] "double"
class(y)
[1] "numeric"
9.5.1 Key Differences:
| Function | Description | Example Output |
|---|---|---|
| typeof() | Shows how data is stored in memory | "double", "integer", "character" |
| class() | Shows the high-level classification (for objects) | "numeric", "factor", "data.frame" |
9.5.2 Common Data Types
| Data Type | typeof() Output | class() Output | Example |
|---|---|---|---|
| Numeric | "double" | "numeric" | x <- 3.14 |
| Integer | "integer" | "integer" | x <- 10L |
| Character | "character" | "character" | x <- "hello" |
| Logical | "logical" | "logical" | x <- TRUE |
| Data Frame | "list" | "data.frame" | x <- data.frame(a = 1:3, b = c("A", "B", "C")) |
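The gap between storage type and class is easiest to see on objects where the two differ. The sketch below uses factors and dates as illustrative cases, with hypothetical variable names.

```r
sizes <- factor(c("small", "large", "small"))   # hypothetical data
typeof(sizes)   # "integer": factors are stored as integer codes
class(sizes)    # "factor"

day <- as.Date("2024-01-15")
typeof(day)     # "double": dates are stored as days since 1970-01-01
class(day)      # "Date"

df <- data.frame(a = 1:3, b = c("A", "B", "C"))
typeof(df)      # "list": a data frame is a list of columns
class(df)       # "data.frame"
```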
9.6 Displaying information
9.6.1 Using print()
The print() function is the most basic way to display output.
x <- "Hello, R!"
print(x)
[1] "Hello, R!"
9.6.2 Using cat()
The cat() function concatenates and prints text without quotes.
name <- "Alice"
cat("Hello,", name, "!\n")
Hello, Alice !
- cat() returns nothing useful (NULL, invisibly); it just displays output.
- \n adds a new line.
9.6.3 Using paste()
The paste() function concatenates text elements into a single string.
name <- "Alice"
paste("Hello,", name, "!\n")
[1] "Hello, Alice !\n"
- paste() returns a character string.
- paste0() is a variant of paste() that does not add spaces between elements.
- To print the result without quotes, use cat().
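The separator behaviour of paste() and paste0() can be sketched as follows; the names first and last are illustrative.

```r
first <- "Ada"        # hypothetical values
last  <- "Lovelace"

paste(first, last)                         # "Ada Lovelace" (default sep = " ")
paste0(first, last)                        # "AdaLovelace"
paste(first, last, sep = "_")              # "Ada_Lovelace"
paste(c("a", "b", "c"), collapse = ", ")   # "a, b, c"

greeting <- paste0("Hello, ", first, "!")
cat(greeting, "\n")                        # Hello, Ada!
```

Using paste0() here avoids the stray space before the exclamation mark that paste() inserts by default.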
9.6.4 Using message()
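The message() function is useful for diagnostic or informational messages. It writes to the standard error stream rather than to standard output.
message("This is a message!")
This is a message!
Unlike print(), message() output appears in RMarkdown only when the chunk option message=TRUE is set (the default); setting message=FALSE hides it.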
9.6.5 Using sprintf()
For formatted output, use sprintf():
name <- "Alice"
age <- 25
sprintf("My name is %s, and I am %d years old.", name, age)
[1] "My name is Alice, and I am 25 years old."
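A few more format specifiers are sketched below, using illustrative values: %.2f fixes the number of decimals, %05d pads with zeros, and %% prints a literal percent sign.

```r
monthly_income <- 80000 / 12

sprintf("%.2f", monthly_income)    # "6666.67"
sprintf("%05d", 42L)               # "00042"
sprintf("%5.1f%%", 12.5)           # " 12.5%"

# sprintf() is vectorized over its arguments
sprintf("%s scored %d", c("Ann", "Bob"), c(90L, 85L))
# "Ann scored 90" "Bob scored 85"
```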
9.7 Taking user input
R’s built-in readline() function can be used to accept input from the user. For example, suppose we wish the user to input their name:
user_name <- readline(prompt="Enter your name: ")
Enter your name:
cat("Hello,", user_name, "!\n")
Hello, !
Since RMarkdown is non-interactive, readline() will not work inside a notebook. Instead, you can assign input directly for demonstration:
user_name <- "Alice"
cat("Hello,", user_name, "!\n")
Hello, Alice !
When using readline(), the input is always a character string, and it must be converted explicitly to numeric before performing calculations.
age <- as.numeric(readline(prompt="Enter your age: "))
Enter your age:
cat("You are", age, "years old.\n")
You are NA years old.
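The NA above appears because no input was available in a non-interactive session. One possible way to guard against that is sketched below; parse_age() is a hypothetical helper, and the fallback value "25" is an assumption used only for demonstration.

```r
# parse_age() is a hypothetical helper: it converts text input to a number
# and stops with a clear message when the text is not numeric.
parse_age <- function(input) {
  age <- suppressWarnings(as.numeric(input))
  if (is.na(age)) {
    stop("Age must be a number, got: ", input)
  }
  age
}

# Only prompt when R is running interactively; otherwise use a fixed value
raw_input <- if (interactive()) readline(prompt = "Enter your age: ") else "25"
age <- parse_age(raw_input)
cat("You are", age, "years old.\n")
```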
9.8 Arithmetic Operations
R supports standard arithmetic operations for numeric values.
| Operation | Symbol | Example | Result |
|---|---|---|---|
| Addition | + | 5 + 3 | 8 |
| Subtraction | - | 10 - 4 | 6 |
| Multiplication | * | 6 * 2 | 12 |
| Division | / | 8 / 2 | 4 |
| Exponentiation | ^ or ** | 3^2 or 3**2 | 9 |
| Integer Division | %/% | 10 %/% 3 | 3 |
| Modulo (Remainder) | %% | 10 %% 3 | 1 |
💡 Note:
- Integer division %/% returns the quotient without the remainder.
- Modulo %% returns the remainder after division.
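Integer division and modulo are often used together, for example to split a duration into hours and minutes. The sketch below uses a hypothetical total_minutes value and also shows how both operators behave with a negative operand.

```r
total_minutes <- 135   # hypothetical duration

hours   <- total_minutes %/% 60   # 2
minutes <- total_minutes %% 60    # 15

# The two results always recombine into the original value
hours * 60 + minutes == total_minutes   # TRUE

# With a negative left operand, %/% rounds down and %% takes the divisor's sign
(-7) %/% 2   # -4
(-7) %% 2    # 1

3^2 == 3**2  # TRUE: ** is accepted as an alternative to ^
```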
9.9 Comparison Operators
Comparison operations return TRUE or FALSE, often used for conditions.
| Operation | Symbol | Example | Result |
|---|---|---|---|
| Greater than | > | 5 > 3 | TRUE |
| Less than | < | 2 < 1 | FALSE |
| Greater than or equal to | >= | 4 >= 4 | TRUE |
| Less than or equal to | <= | 6 <= 5 | FALSE |
| Equal to | == | 5 == 5 | TRUE |
| Not equal to | != | 3 != 2 | TRUE |
💡 Note:
- Always use == for comparison (not =).
- != checks if values are different.
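One caveat with == is that floating-point numbers may not compare as exactly equal. The sketch below shows the issue and a tolerance-based alternative using base R's all.equal(); the ages vector is hypothetical.

```r
0.1 + 0.2 == 0.3                     # FALSE, due to floating-point rounding
isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE, compares within a small tolerance

# Comparisons are vectorized: one result per element
ages <- c(15, 22, 37)                # hypothetical data
ages >= 18                           # FALSE TRUE TRUE

"apple" == "Apple"                   # FALSE: string comparison is case-sensitive
```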
9.10 Logical Operators
Logical operators are used to combine conditions in R. There are two types of logical operators:
- Element-wise operators: & (AND), | (OR), and ! (NOT) work element-by-element on vectors.
- Short-circuit operators: && (AND) and || (OR) work on single (length-one) values and evaluate the right-hand side only when needed. They are primarily used in control flow (e.g., if statements).
| Operator | Symbol | Description | Example | Result |
|---|---|---|---|---|
| AND | & | Element-wise AND (both must be TRUE) | (5 > 3) & (2 < 4) | TRUE |
| OR | \| | Element-wise OR (at least one must be TRUE) | (5 > 3) \| (2 > 4) | TRUE |
| NOT | ! | Negates a logical value | !(5 > 3) | FALSE |
| Short-circuit AND | && | Single-value AND; skips the right side if the left is FALSE | TRUE && FALSE | FALSE |
| Short-circuit OR | \|\| | Single-value OR; skips the right side if the left is TRUE | TRUE \|\| FALSE | TRUE |
9.11 Differences Between Element-wise and Short-circuit Operators
| Operator Type | Symbol | Works on Vectors? | Use Case |
|---|---|---|---|
| Element-wise AND | & | ✅ Yes | Use with vectors or data frames |
| Element-wise OR | \| | ✅ Yes | Use with vectors or data frames |
| Short-circuit AND | && | ❌ No (operands must have length one) | Use in if statements |
| Short-circuit OR | \|\| | ❌ No (operands must have length one) | Use in if statements |
💡 Key Takeaways:
- Use & and | for vector operations (e.g., filtering in data frames).
- Use && and || in if statements for better efficiency.
- The ! operator negates logical values (useful for filtering and reversing conditions).
Examples:
x <- c(TRUE, FALSE, TRUE)
y <- c(FALSE, TRUE, TRUE)
# Element-wise AND
x & y
[1] FALSE FALSE TRUE
# Element-wise OR
x | y
[1] TRUE TRUE TRUE
a <- 10
b <- 5
if (a > 0 && b > 0) {
  print("Both are positive")
}
[1] "Both are positive"
if (a > 0 || b < 0) {
  print("At least one condition is met")
}
[1] "At least one condition is met"
x <- c(TRUE, FALSE, TRUE)
!x # Negate each element
[1] FALSE TRUE FALSE
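Short-circuit evaluation itself can be demonstrated with a right-hand side that would fail if it were evaluated. In the sketch below, the stop() calls, the value variable, and the ages vector are illustrative assumptions.

```r
# The right-hand side is skipped when the left side already decides the result
FALSE && stop("never evaluated")   # FALSE, no error is raised
TRUE  || stop("never evaluated")   # TRUE, no error is raised

# A common idiom: guard a check that would fail on NULL
value <- NULL   # hypothetical input
if (!is.null(value) && value > 0) {
  print("Positive value")
} else {
  print("Missing or non-positive value")
}

# Element-wise & is the right tool for filtering a vector
ages <- c(15, 22, 37, 70)           # hypothetical data
ages[ages >= 18 & ages < 65]        # 22 37
```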
9.1 Comments
Comments in R start with the # symbol. Everything after # on a line is ignored by R.
R does not support multi-line comments like Python’s """, but you can simulate them using multiple # symbols.
To comment a block of code quickly in RStudio, use:
- Ctrl + Shift + C (Windows/Linux)
- Cmd + Shift + C (Mac)
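The comment styles described above look like this in practice. This is a minimal sketch, and the radius and area variables are illustrative.

```r
# This is a single-line comment
radius <- 2   # a comment can also follow code on the same line

# A "multi-line" comment is simply several
# consecutive lines that each start with #,
# as in this three-line block.
area <- pi * radius^2
area          # 12.56637
```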