Chapter 2 Objects and data types

Both Python and R are object-oriented, high-level languages. The term “high-level” means that the language hides most of the details of how the computer actually works, which makes it easier for a human programmer to read and write. For example, you don’t need to think about how much RAM to allocate to create and interact with an object; the language does it for you behind the scenes. The standard interpreters that run Python and R code are themselves largely written in lower-level languages such as C.

The term “object-oriented” means that objects are central to how programs are written. An object is any piece of data, from a single value to an entire complex dataset. In object-oriented programming, each object belongs to a class. Every object of a class has certain attributes (data that describe it) and methods (actions that can be applied to it).

For example, I have a succulent plant that I call Steve. If it were an object in my computer, we could define the following:

  • class: plant

  • object: Steve

  • attributes: age, number of leaves, biomass, etc.

  • methods: water, plant in a larger pot, add nutrients.
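
The Steve example can be sketched in Python as follows. This is only an illustration: the class name Plant, its attribute names (age_years, n_leaves, pot_diameter_cm, is_watered), and the methods water() and repot() are hypothetical names made up for this example, not part of Python itself.

```python
class Plant:
    """A hypothetical class describing a potted plant."""

    def __init__(self, name, age_years, n_leaves, pot_diameter_cm):
        # Attributes: data stored on each individual object
        self.name = name
        self.age_years = age_years
        self.n_leaves = n_leaves
        self.pot_diameter_cm = pot_diameter_cm
        self.is_watered = False

    def water(self):
        # Method: an action that can be applied to the object
        self.is_watered = True

    def repot(self, new_diameter_cm):
        # Only move the plant if the new pot is actually larger
        if new_diameter_cm > self.pot_diameter_cm:
            self.pot_diameter_cm = new_diameter_cm


# Create one object (Steve) of the class Plant; the values are invented
steve = Plant(name="Steve", age_years=2, n_leaves=14, pot_diameter_cm=8)
steve.water()
steve.repot(12)

print(type(steve).__name__)   # Plant
print(steve.is_watered)       # True
print(steve.pot_diameter_cm)  # 12
```

Here Plant is the class, steve is the object, the values stored on self are its attributes, and water() and repot() are its methods.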

Python and R take slightly different approaches to working with objects, but the logic is very similar. Before we get there, let’s talk about the simplest types of data that can be stored in an object.

2.1 Python

  • Boolean: the simplest logical data type, a single value out of two possible values; a coin flip is a good example, where the two possible values can be viewed as “heads” or “tails”, “yes” or “no”, “1” or “0”, “True” or “False”; in Python, the Boolean values are written as True and False

  • Numeric: a number, of which there are several types:

    • Integer: any whole number, positive, negative, or zero, e.g., 1, 5000, -10000000

    • Float: a number with a decimal point, e.g., 1.235; a whole value written with a decimal point, such as 1.0, is also a float

    • Complex number: a number with a real and an imaginary part, e.g., 2 + 3j; you are unlikely to need it unless you are deep into math

  • Sequence: any ordered sequence of values

    • String: a sequence of characters, in Python defined with single or double quotation marks, e.g., 'Hello world' or "Hello world"; a string can have any length, from an empty string ('') or a single character to a whole paragraph of text

    • List: an ordered list of objects of any type denoted with square brackets, e.g., [1, True, 0.25, "apple"]; lists can be modified after they are created; a list element can also be a list, e.g., [[1, True, 0.25, "apple"], "second element here"]

    • Tuple: similar to a list, but immutable (it cannot be modified after it is created); a tuple is usually written with parentheses, e.g., (1, True, 0.25)

  • Dictionary: a collection of key-value pairs, like the entries of a bilingual dictionary, e.g., the key “one” with the value “uno”; defined with curly brackets as {"key 1": "value 1", "key 2": "value 2"}

  • Set: an unordered collection of unique values, defined with curly brackets, e.g., {1, 2, 3}, or with the function set()
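
The types listed above can be checked directly with the built-in function type(). The following is a minimal sketch; the variable names and the extra values (such as the "two": "dos" entry and "pear") are made up for illustration.

```python
# One example value for each of the data types described above
flag = True                                   # Boolean
whole = 5000                                  # integer
decimal = 1.235                               # float
z = 2 + 3j                                    # complex number
text = "Hello world"                          # string
items = [1, True, 0.25, "apple"]              # list
fixed = (1, True, 0.25)                       # tuple
translations = {"one": "uno", "two": "dos"}   # dictionary
unique = {1, 2, 2, 3}                         # set: the duplicate 2 is dropped

# Collect the name of each value's type
values = [flag, whole, decimal, z, text, items, fixed, translations, unique]
type_names = [type(v).__name__ for v in values]
print(type_names)

# Lists can be modified after they are created...
items.append("pear")

# ...but tuples cannot: assigning to an element raises a TypeError
try:
    fixed[0] = 99
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True

print(items)
print(tuple_is_immutable)
print(len(unique))
print(translations["one"])
```

Note that the set ends up with three elements even though four values were written inside the curly brackets, because a set keeps only unique values.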

2.2 R

  • Logical: same as Boolean in Python, written as TRUE or FALSE; these can be abbreviated as T or F, but the abbreviations are ordinary variables that can be overwritten, so the full forms are safer

  • Numeric: any real (non-complex) number, e.g., 5 or 1.235; by default R stores such values as double-precision numbers

  • Character: same as string in Python, e.g., "Hello world"

More complex data structures in R, such as vectors, data frames, and factors, will be discussed later.