9 R: Basics
9.2 Data Types
R has several built-in data types that are crucial for data analysis and computation. The commonly used data types include:
- Numeric (double and integer)
- Character (strings)
- Logical (TRUE or FALSE)
The data type of a variable or constant in R can be identified using the built-in functions class() or typeof(). The subsections below demonstrate each data type with example variables and values.
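As a quick illustration of how these two functions report types, the sketch below inspects one value of each kind. The variable names (price, count, city, in_stock) are made up for the example and do not come from the original text.

```r
price    <- 19.99      # numeric, stored as a double
count    <- 3L         # integer (note the L suffix)
city     <- "Chicago"  # character
in_stock <- TRUE       # logical

typeof(price)     # "double"
class(price)      # "numeric"
typeof(count)     # "integer"
typeof(city)      # "character"
typeof(in_stock)  # "logical"
```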
9.2.1 Numeric Data
In R, numeric data consists of two types:
- Double (default) – Represents floating-point numbers with decimal precision.
- Integer – Represents whole numbers, defined explicitly using the L suffix.
x <- 10 # Default: Double
y <- 10L # Explicit Integer
typeof(x) # "double"
[1] "double"
typeof(y) # "integer"
[1] "integer"
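The distinction between doubles and integers shows up in a few everyday situations. The following lines are a small sketch using only base R functions; the values are arbitrary.

```r
is.integer(10)      # FALSE: 10 without a suffix is stored as a double
is.integer(10L)     # TRUE
typeof(1:3)         # "integer": the colon operator yields integers
typeof(10L / 2L)    # "double": division always returns a double
as.integer(3.9)     # 3: conversion truncates toward zero
```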
9.2.2 Character Data
Strings in R are created by assigning character values, enclosed in single or double quotes, to a variable. Strings can then be concatenated with functions such as paste() to form longer strings.
# R program for string creation
# Creating a string with double quotes
str1 <- "OK1"
cat("String 1 is :", str1)
String 1 is : OK1
# Creating a string with single quotes
str2 <- 'OK2'
cat("String 2 is :", str2)
String 2 is : OK2
# Single quotes inside a double-quoted string
str3 <- "This is 'acceptable and 'allowed' in R"
cat("String 3 is :", str3)
String 3 is : This is 'acceptable and 'allowed' in R
# Double quotes inside a single-quoted string
str4 <- 'Hi, Wondering "if this "works"'
cat("String 4 is :", str4)
String 4 is : Hi, Wondering "if this "works"
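R prints strings with double quotes regardless of which quote style was used to create them. A double-quoted string can contain single quotes, and a single-quoted string can contain double quotes. To include the same kind of quote that delimits the string, escape it with a backslash (for example, \" inside a double-quoted string or \' inside a single-quoted one).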
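A quote character of the same kind as the delimiter can be embedded by escaping it with a backslash. The sketch below shows this with two made-up strings (quoted_double and quoted_single are illustrative names).

```r
quoted_double <- "She said \"hello\""   # escaped double quotes
quoted_single <- 'It\'s fine'           # escaped single quote

print(quoted_double)      # print() shows the escapes: [1] "She said \"hello\""
cat(quoted_double, "\n")  # cat() shows the raw text: She said "hello"
nchar(quoted_single)      # 9: the backslash is not part of the string
```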
9.2.3 Logical Data
In R, logical data consists of values that represent Boolean (TRUE/FALSE) conditions.
Logical values are used in conditional statements, indexing, and logical operations.
R has three logical values:
- TRUE (can be written as T)
- FALSE (can be written as F)
- NA (logical missing value)
x <- TRUE
y <- FALSE
z <- NA # Missing logical value
typeof(x)
[1] "logical"
typeof(y)
[1] "logical"
typeof(z)
[1] "logical"
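Logical values behave like 1 and 0 in arithmetic, and NA propagates only when the result really depends on the missing value. A short sketch with an invented vector named passed:

```r
passed <- c(TRUE, FALSE, TRUE, TRUE)
sum(passed)    # 3: TRUE counts as 1, FALSE as 0
mean(passed)   # 0.75: proportion of TRUE values

TRUE & NA      # NA: the result depends on the unknown value
FALSE & NA     # FALSE: the result is FALSE whatever NA stands for
TRUE | NA      # TRUE
is.na(NA)      # TRUE: the reliable way to test for a missing value
```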
9.3 Variables
The following rules apply to an R variable name:
- A variable name must start with a letter or a period (.) and can be a combination of letters, digits, periods (.) and underscores (_). If it starts with a period, the period cannot be followed by a digit.
- A variable name cannot start with a number or underscore (_)
- Variable names are case-sensitive (age, Age and AGE are three different variables)
- Reserved words cannot be used as variables (TRUE, FALSE, NULL, if…)
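One way to check these rules in practice is base R's make.names(), which rewrites a string into a syntactically valid name. Comparing a candidate with its rewritten form, as in the sketch below, is an idiom rather than an official validator, and the candidate names are invented for the example.

```r
candidates <- c("my.var_1", ".hidden", "2fast", ".2way", "_temp", "TRUE")

# A name is valid when make.names() leaves it unchanged.
is_valid <- make.names(candidates) == candidates

data.frame(name = candidates, valid = is_valid)
# my.var_1 and .hidden are valid; the other four are not
```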
9.3.1 The assignment operator
9.3.1.1 Using <- (Preferred Operator)
The <- operator is the standard way to assign values in R:
x <- 10
y <- "Hello, R!"
z <- TRUE
9.3.1.2 Using = (Not Recommended)
Although = can be used for assignment, it is generally not recommended: inside a function call, = names an argument instead of assigning a variable, so its meaning depends on context:
x = 10 # Works, but `<-` is preferred
💡 Best Practice:
- Always use <- for assignments to avoid ambiguity.
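To illustrate the ambiguity, the sketch below contrasts the two operators inside a function call. The numbers and the variable name scores are made up for the example.

```r
# Inside a call, `=` only names the argument x of mean();
# no variable called x is created by this line.
m1 <- mean(x = c(2, 4, 6))

# `<-` inside a call assigns first and then passes the value on,
# so scores exists afterwards.
m2 <- mean(scores <- c(2, 4, 6))

m1       # 4
m2       # 4
scores   # 2 4 6
```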
9.4 Converting datatypes
Sometimes a value has a datatype that does not suit the operation we want to perform on it. For example, consider the variable annual_income in the code below:
annual_income <- "80000"
Suppose we wish to divide annual_income by 12 to get the monthly income. We cannot use annual_income directly because its datatype is a character string and not a number, so numerical operations cannot be performed on it.
We need to convert annual_income to an integer. For that we use R’s in-built as.integer() function:
annual_income <- as.integer(annual_income)
monthly_income <- annual_income/12
print(paste0("monthly income = ", monthly_income))
[1] "monthly income = 6666.66666666667"
Similarly, datatypes can be converted from one type to another using in-built R functions as shown below:
# Converting numeric to character
as.character(9)
[1] "9"
# Converting character to numeric
as.numeric('9.4')
[1] 9.4
# Converting logical to numeric
as.numeric(FALSE)
[1] 0
Note that any non-zero numeric value, if converted to the ‘logical’ datatype, will return TRUE, while converting 0 will return FALSE. Character strings such as "TRUE", "true", "T", "FALSE", "false", and "F" can also be converted; any other string returns NA.
# Converting a positive number to logical
as.logical(40)
[1] TRUE
# Converting zero to logical
as.logical(0)
[1] FALSE
# Converting a negative non-integer number to logical
as.logical(-30.1)
[1] TRUE
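Character input can be converted to logical as well, but only a fixed set of spellings is recognised. The sketch below uses an invented vector named inputs to show which strings are accepted.

```r
inputs <- c("TRUE", "true", "T", "False", "yes", "0", "1")
converted <- as.logical(inputs)
converted
# [1]  TRUE  TRUE  TRUE FALSE    NA    NA    NA
```

Note that the strings "0" and "1" give NA, even though the numbers 0 and 1 convert to FALSE and TRUE.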
Sometimes, conversion of a value may not be possible. For example, it is not possible to convert the variable greeting defined below to a number:
greeting <- "hello"
as.numeric(greeting)
Warning: NAs introduced by coercion
[1] NA
However, strings can be concatenated using the paste0() function:
paste0("hello", " there!")
[1] "hello there!"
The following table summarizes how to convert between Numeric, Character, and Logical types in R:
From → To | Conversion Function | Example Usage | Notes | Failure Behavior |
---|---|---|---|---|
Numeric → Character | as.character(x) | as.character(42) → "42" | Converts numbers to strings | Not applicable (always succeeds) |
Numeric → Logical | as.logical(x) | as.logical(0) → FALSE | 0 is FALSE, non-zero is TRUE | NA input stays NA |
Character → Numeric | as.numeric(x) | as.numeric("3.14") → 3.14 | The string must contain a valid number | Returns NA with a warning if conversion fails |
Character → Logical | as.logical(x) | as.logical("TRUE") → TRUE | Accepts "TRUE", "true", "True", "T" and the corresponding FALSE forms | Returns NA for any other string |
Logical → Numeric | as.numeric(x) | as.numeric(TRUE) → 1 | TRUE = 1, FALSE = 0 | NA input stays NA |
Logical → Character | as.character(x) | as.character(FALSE) → "FALSE" | Converts logical values to strings | Not applicable (always succeeds) |
Note: Always verify conversions using class() or typeof() to ensure expected results.
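One possible way to verify a conversion is to check both the resulting class and which inputs turned into NA. The following is a minimal sketch, assuming a made-up vector raw_values of user-supplied strings.

```r
raw_values <- c("80000", "12.5", "hello", NA)

# suppressWarnings() hides the "NAs introduced by coercion" warning,
# so the failures are detected explicitly instead.
converted <- suppressWarnings(as.numeric(raw_values))

# A failure is a value that became NA although the input was not missing.
failed <- is.na(converted) & !is.na(raw_values)

class(converted)      # "numeric"
raw_values[failed]    # "hello": the only value that could not be converted
```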
9.5 typeof() vs. class() in R
In R, typeof() and class() are used to determine different aspects of an object’s type:
- typeof(): Returns the low-level storage mode of an object.
- class(): Returns the high-level classification of an object.
x <- 10L # Integer value
typeof(x) # "integer"
[1] "integer"
class(x) # "integer"
[1] "integer"
y <- c(1, 2, 3) # Numeric vector
typeof(y) # "double"
[1] "double"
class(y) # "numeric"
[1] "numeric"
9.5.1 Key Differences:
Function | Description | Example Output |
---|---|---|
typeof() | Shows how data is stored in memory | "double", "integer", "character" |
class() | Shows the high-level classification (for objects) | "numeric", "factor", "data.frame" |
9.5.2 Common Data Types
Data Type | typeof() Output | class() Output | Example |
---|---|---|---|
Numeric | "double" | "numeric" | x <- 3.14 |
Integer | "integer" | "integer" | x <- 10L |
Character | "character" | "character" | x <- "hello" |
Logical | "logical" | "logical" | x <- TRUE |
Data Frame | "list" | "data.frame" | x <- data.frame(a = 1:3, b = c("A", "B", "C")) |
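The gap between storage type and class is most visible for objects built on top of a basic type. The sketch below uses a factor and a date as two further examples; sizes and start_date are illustrative names.

```r
sizes <- factor(c("small", "large", "small"))
typeof(sizes)        # "integer": factors store integer codes
class(sizes)         # "factor"

start_date <- as.Date("2024-01-15")
typeof(start_date)   # "double": stored as days since 1970-01-01
class(start_date)    # "Date"
```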
9.6 Displaying information
9.6.1 Using print()
The print() function is the most basic way to display output.
x <- "Hello, R!"
print(x)
[1] "Hello, R!"
9.6.2 Using cat()
The cat() function concatenates and prints text without quotes.
name <- "Alice"
cat("Hello,", name, "!\n")
Hello, Alice !
- cat() does not return a useful value (it returns NULL invisibly); it just displays output.
- \n adds a new line.
9.6.3 Using paste()
The paste() function concatenates text elements into a single string.
name <- "Alice"
paste("Hello,", name, "!\n")
[1] "Hello, Alice !\n"
- paste() returns a character string.
- paste0() is a variant of paste() that does not add spaces between elements.
- To print the result without quotes, use cat().
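The separator and the way vectors are combined can be controlled with the sep and collapse arguments of paste(). A brief sketch with arbitrary values:

```r
paste("2024", "01", "15", sep = "-")                # "2024-01-15"
paste0("item", 1:3)                                 # "item1" "item2" "item3"
paste(c("red", "green", "blue"), collapse = ", ")   # "red, green, blue"
```

Here sep is placed between the arguments, while collapse joins the elements of the resulting vector into one string.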
9.6.4 Using message()
The message() function is useful for diagnostic or informational messages. Its output goes to the standard error stream rather than standard output.
message("This is a message!")
This is a message!
Unlike print() output, the output of message() appears in a rendered RMarkdown document only when the chunk option message is TRUE, which is the knitr default; set message=FALSE to hide it.
9.6.5 Using sprintf()
For formatted output, use sprintf():
name <- "Alice"
age <- 25
sprintf("My name is %s, and I am %d years old.", name, age)
[1] "My name is Alice, and I am 25 years old."
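Beyond %s and %d, sprintf() supports width, padding, and precision modifiers. The lines below sketch a few common ones with made-up values.

```r
sprintf("%.2f", 3.14159)   # "3.14": two decimal places
sprintf("%5d", 42L)        # "   42": right-aligned in 5 characters
sprintf("%05d", 42L)       # "00042": zero-padded
sprintf("%d%%", 50L)       # "50%": %% prints a literal percent sign

# Arguments are vectorized, so one call can format several values.
sprintf("%s scored %d", c("Ann", "Bob"), c(90L, 85L))
# "Ann scored 90" "Bob scored 85"
```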
9.7 Taking user input
R’s in-built readline() function can be used to accept input from the user. For example, suppose we wish the user to input their name:
user_name <- readline(prompt="Enter your name: ")
Enter your name:
cat("Hello,", user_name, "!\n")
Hello, !
Since RMarkdown is non-interactive, readline() will not work inside a notebook. Instead, you can assign input directly for demonstration:
user_name <- "Alice"
cat("Hello,", user_name, "!\n")
Hello, Alice !
When using readline(), the input is always a character string, and it must be converted explicitly to numeric before performing calculations.
age <- as.numeric(readline(prompt="Enter your age: "))
Enter your age:
cat("You are", age, "years old.\n")
You are NA years old.
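The NA above appears because no input was available to convert. One way to handle this is to separate the prompt from the conversion and to fall back to a fixed answer when the session is not interactive. This is only a sketch: parse_age() is a hypothetical helper and "25" is an arbitrary fallback value.

```r
# Hypothetical helper: convert the typed text and fail clearly if it is not a number.
parse_age <- function(answer) {
  age <- suppressWarnings(as.numeric(answer))
  if (is.na(age)) {
    stop("Age must be a number, got: '", answer, "'")
  }
  age
}

# interactive() is FALSE when a document is rendered or a script is run
# with Rscript, so a fixed answer is used in that case.
answer <- if (interactive()) readline(prompt = "Enter your age: ") else "25"
age <- parse_age(answer)
cat("You are", age, "years old.\n")
```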
9.8 Arithmetic Operations
R supports standard arithmetic operations for numeric values.
Operation | Symbol | Example | Result |
---|---|---|---|
Addition | + | 5 + 3 | 8 |
Subtraction | - | 10 - 4 | 6 |
Multiplication | * | 6 * 2 | 12 |
Division | / | 8 / 2 | 4 |
Exponentiation | ^ or ** | 3^2 or 3**2 | 9 |
Integer Division | %/% | 10 %/% 3 | 3 |
Modulo (Remainder) | %% | 10 %% 3 | 1 |
💡 Note:
- Integer division %/% returns the quotient without the remainder.
- Modulo %% returns the remainder after division.
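Integer division and modulo behave in a specific way for negative numbers, and operator precedence can also surprise. The sketch below collects a few cases worth checking; the numbers are arbitrary.

```r
7 %/% 2     # 3
7 %% 2      # 1
-7 %/% 2    # -4: the quotient is rounded down, not toward zero
-7 %% 2     # 1: the remainder takes the sign of the divisor
5L / 2L     # 2.5: `/` returns a double even for integers
2 + 3 * 4   # 14: multiplication binds tighter than addition
-2^2        # -4: ^ binds tighter than the unary minus
```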
9.9 Comparison Operators
Comparison operations return TRUE or FALSE, often used for conditions.
Operation | Symbol | Example | Result |
---|---|---|---|
Greater than | > | 5 > 3 | TRUE |
Less than | < | 2 < 1 | FALSE |
Greater than or equal to | >= | 4 >= 4 | TRUE |
Less than or equal to | <= | 6 <= 5 | FALSE |
Equal to | == | 5 == 5 | TRUE |
Not equal to | != | 3 != 2 | TRUE |
💡 Note:
- Always use == for comparison (not =).
- != checks if values are different.
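Two caveats apply when comparing values: floating-point numbers may not be exactly equal, and comparisons involving NA return NA. A short sketch using base R's all.equal() for tolerant comparison:

```r
0.1 + 0.2 == 0.3                    # FALSE: binary floating-point rounding
isTRUE(all.equal(0.1 + 0.2, 0.3))   # TRUE: equal within a small tolerance

c(1, 5, 10) > 4                     # FALSE TRUE TRUE: comparisons are vectorized
NA > 1                              # NA: comparing with a missing value is unknown
```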
9.10 Logical Operators
Logical operators are used to combine conditions in R. There are two types of logical operators:
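- Element-wise operators: & (AND), | (OR), and ! (NOT) work element-by-element on vectors.
- Short-circuit operators: && (AND) and || (OR) expect conditions of length one and evaluate the right-hand side only when the left-hand side does not already decide the result. They are primarily used in control flow (e.g., if statements). Versions of R before 4.3.0 used only the first element of a longer vector; since 4.3.0 this is an error.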
Operator | Symbol | Description | Example | Result |
---|---|---|---|---|
AND | & | Element-wise AND (both must be TRUE) | (5 > 3) & (2 < 4) | TRUE |
OR | \| | Element-wise OR (at least one must be TRUE) | (5 > 3) \| (2 > 4) | TRUE |
NOT | ! | Negates a logical value | !(5 > 3) | FALSE |
Short-circuit AND | && | Stops at the first FALSE; operands must have length one | TRUE && FALSE | FALSE |
Short-circuit OR | \|\| | Stops at the first TRUE; operands must have length one | TRUE \|\| FALSE | TRUE |
9.11 Differences Between Element-wise and Short-circuit Operators
Operator Type | Symbol | Works on Vectors? | Use Case |
---|---|---|---|
Element-wise AND | & | ✅ Yes | Use with vectors or data frames |
Element-wise OR | \| | ✅ Yes | Use with vectors or data frames |
Short-circuit AND | && | ❌ No (length-one conditions only) | Use in if statements |
Short-circuit OR | \|\| | ❌ No (length-one conditions only) | Use in if statements |
💡 Key Takeaways:
- Use & and | for vector operations (e.g., filtering in data frames).
- Use && and || in if statements for better efficiency.
- The ! operator negates logical values (useful for filtering and reversing conditions).
Examples:
x <- c(TRUE, FALSE, TRUE)
y <- c(FALSE, TRUE, TRUE)
# Element-wise AND
x & y
[1] FALSE FALSE TRUE
# Element-wise OR
x | y
[1] TRUE TRUE TRUE
a <- 10
b <- 5
if (a > 0 && b > 0) {
  print("Both are positive")
}
[1] "Both are positive"
if (a > 0 || b < 0) {
  print("At least one condition is met")
}
[1] "At least one condition is met"
x <- c(TRUE, FALSE, TRUE)
!x # [FALSE TRUE FALSE]
[1] FALSE TRUE FALSE
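The two takeaways can be sketched side by side: element-wise & for filtering a vector, and short-circuit && to guard a check that would otherwise fail. The vector ages and the helper is_positive() are invented for this illustration.

```r
ages <- c(15, 22, 37, 64, 71)
working_age <- ages[ages >= 18 & ages < 65]   # element-wise filter
working_age                                   # 22 37 64

is_positive <- function(value) {
  # The right-hand side runs only when value is not NULL.
  # Without the guard, NULL > 0 gives logical(0), which if() rejects.
  !is.null(value) && value > 0
}
is_positive(NULL)   # FALSE
is_positive(3)      # TRUE
```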
9.1 Comments
- Comments in R start with the # symbol. Everything after # on a line is ignored by R.
- R does not support multi-line comments like Python’s """, but you can simulate them by starting each line with #.
- To comment a block of code quickly in RStudio, use Ctrl + Shift + C (Windows/Linux) or Cmd + Shift + C (Mac).
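The comment styles described above can be sketched as follows; the radius and area calculation is only a placeholder for real code.

```r
# This whole line is a comment.
radius <- 2               # a comment can also follow code

# A longer explanation is written as
# several consecutive comment lines,
# each starting with its own # symbol.
area <- pi * radius^2
area
```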