Sunday, April 05, 2020

R : A crash course for programmers - Chapter01

I have wanted for quite some time to write a blog on R for the benefit of programmers, mainly senior managers in organizations who were once programmers or had good coding experience at some point in their careers.

How can they pick up R easily? The next series of posts will be on R.

The contents are also available in book form (PDF). Leave your email in the comments and I will mail the book to you.
=========================================================================

Chapter 1 Vectors


2.1 Introduction


R differs from other programming languages in the following ways.

a.    We don’t use the word variables; rather, we use the word vectors. A vector is a data structure.

b.    Other programming languages have, for instance, numbers (ints and floats), arrays and so on, whereas in R everything is an array. So a number is essentially an array with a single element.

Let us take some examples.

> a<-10
> b<-c(10)
> a
[1] 10
> b
[1] 10
> class(a)
[1] "numeric"
> class(b)
[1] "numeric"

Vector a is defined as a number, whereas b is defined as an array. Both are treated as an array of one number.
The class() function reports the type of the array.
<- is the operator that assigns a value. It is a less-than sign followed by a hyphen.
> a[1]
[1] 10
> b[1]
[1] 10

As proof that both are the same, we can print the first element of each. Note that a also behaves like an array, because it is one.
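Another way to convince yourself is to compare the two definitions directly. The short sketch below uses only base R functions (length, is.vector and identical) on the same a and b as above.

```r
a <- 10        # defined as a plain number
b <- c(10)     # defined with c(), the combine function

length(a)        # number of elements in a: 1
is.vector(a)     # TRUE, a single number is already a vector
identical(a, b)  # TRUE, both definitions produce the same object
```

If the last line printed FALSE, the two definitions would be different things. It prints TRUE.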








2.2 Operations in R.




R is rich in operators. An operator can evaluate to a numeric value or to a logical value. As I said earlier, all variables are essentially arrays, so an operator works on an array just as it does on a single number.

Look at the following examples.

> a<-10;b<-c(10,20,30)
> a;b
[1] 10
[1] 10 20 30
> a+10;a*10;a^2;a%%3
[1] 20
[1] 100
[1] 100
[1] 1
> b+10;b*10;b^2;b%%3
[1] 20 30 40
[1] 100 200 300
[1] 100 400 900
[1] 1 2 0

The ; can be used to separate multiple commands issued on the same line.
The arithmetic operators +, -, * and / are self-explanatory. ^ raises to a power.
a^2 here means a squared (a²).

%% is the modulus operator; it gives the remainder after division.
> a>10;b>10
[1] FALSE
[1] FALSE  TRUE  TRUE

Comparison operators are supported as well: >, >=, <, <=, == and !=.
They return a logical output (TRUE or FALSE) for each element.
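The same element-by-element behaviour applies when both sides are arrays, and the logical output can itself be used in further calculations. Here is a small sketch; the second vector d and the name over.ten are made up for this example, while sum() and which() are base R functions.

```r
b <- c(10, 20, 30)
d <- c(1, 2, 3)        # a second vector, made up for this example

b + d                  # element by element: 11 22 33
b * d                  # element by element: 10 40 90

over.ten <- b > 10     # logical vector: FALSE TRUE TRUE
sum(over.ten)          # TRUE counts as 1, so this counts the matches: 2
which(over.ten)        # positions where the condition holds: 2 3
```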






2.3 Names in R.




The elements of a vector are accessed using integers. The first item is at index 1, not 0 as in many other languages. R is more interesting in one respect, and that is naming: you can name a vector’s columns and then use those names to access the items. Look at the following examples to understand more. Remember that any line beginning with # is treated as a comment.

> sales<-c(120,190,90)
> sales
[1] 120 190  90
> sales[1]
[1] 120
> sales[3]
[1] 90

Here the sales vector is defined with three numbers. You can check by entering class(sales), which outputs "numeric".
The elements are accessed using the numbers 1, 2, 3 and so on.
> names(sales)<-c("Jan","Feb","Mar")
> sales
Jan Feb Mar
120 190  90

The names() function gives names to the columns. Since we have three columns in the sales vector, we give three names (Jan, Feb and Mar).
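Once the names are in place, they can go inside the square brackets instead of integer positions. A minimal sketch, assuming the sales figures and month names shown above:

```r
sales <- c(120, 190, 90)
names(sales) <- c("Jan", "Feb", "Mar")

sales["Feb"]              # one element, picked by name
sales[c("Jan", "Mar")]    # several elements at once
names(sales)              # the names on their own: "Jan" "Feb" "Mar"
```

Access by name keeps working even if the order of the columns changes later, which access by position does not.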
> max(sales)
[1] 190
>
> sales==max(sales)
  Jan   Feb   Mar
FALSE  TRUE FALSE

max() is a built-in function that goes through the vector (sales) and returns the maximum value.
Comparing with sales==max(sales) gives us a logical vector of TRUE/FALSE values, with TRUE marking the column that holds the maximum.
> max.sales<-sales==max(sales)
> sales[max.sales]
Feb
190
> class(sales)
[1] "numeric"
> class(max.sales)
[1] "logical"

Here we create a new logical vector called max.sales. Remember that vector names can contain . or _ as well as lower and upper case letters.
sales[max.sales] displays the column that has the TRUE value.
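The same idea works with any condition, not just the maximum. The sketch below is one possible way to pull out only the name of the best month and to filter on a threshold. It assumes the same sales figures as above; which.max() is a base R function, and the threshold of 100 is made up for the example.

```r
sales <- c(Jan = 120, Feb = 190, Mar = 90)   # names given directly inside c()
max.sales <- sales == max(sales)             # logical vector, TRUE only for Feb

names(sales)[max.sales]    # just the name of the best month: "Feb"
names(which.max(sales))    # same answer using which.max(): "Feb"
sales[sales > 100]         # every month above 100: Jan and Feb
```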





That brings us to the end of chapter 1.

Some exercises follow for you to try.




2.4 Exercises




1.     Download R. Go to the following web site to download it.


b.   

c.     Create a folder called c:\R

d.     Install R in this folder.

e.     You don’t need anything else for the time being. There are some good IDEs to work with; one of them is RStudio. You can download it from https://rstudio.com/products/rstudio/download/

f.     Choose the free RStudio Desktop version and install it in the c:\R folder.

2.    Run R.exe from C:\R\R-3.6.3\bin, if you installed R in c:\R.

a.    You can do all the remaining questions from here. If you use RStudio, you can do the exercises from there instead.

b.   

c.     Increase the size of the “Console” window and you can do the exercises from there. If you run R.exe, then the screen looks like this

d.    

e.     The background is black and the fonts are small by default. Right-click on the title bar and go to Properties.

f.    

g.    You can set the colors and fonts from here.

3.   Questions that you need to do.

a.    Run the sample code used in this chapter and check that you see the same output as shown.

b.    Declare a numeric vector called train.late with the following values: (12, 16, 14, 21, 3, 19, 9)

c.     Name the elements Mon to Sun.

d.     Assuming that the numbers are in minutes, find the following:

                                          i.    On which day was the train the least late?

                                         ii.    On which day was the train the most late?

                                        iii.    What is the average delay of the train, in minutes?

                                       iv.    On which days was the train more than 10 minutes late?

e.     Find the value of the following expressions

                                          i.    32+64

                                         ii.    32+25*4

                                        iii.    32-(25+16)*12

                                       iv.    Remainder of 100 divided by 9.

f.     Create a nums.list vector with the following numbers 1,2,3,4

g.    Create a nums.list2 with the following numbers 2,4,6,8

h.    Find the following

                                          i.    nums.list + nums.list2

                                         ii.    Nums.listnums.list2

                                        iii.    Nums.list2nums.list


4.  You define a vector x using <-.

a.    What is the output of sort(x)?

b.    What if you want the output in reverse order (max to min)?