Thursday, August 10, 2017

(R) Misc.

Before I begin writing entries pertaining to data modeling, there are a few final concepts that I would like to review. In this article, I will be discussing different methodologies that were not included in my previous entries. These concepts are nevertheless important, and should be mentioned before progressing on to more difficult tasks.

Saving Work

When exiting RStudio, you will be presented with a few options pertaining to saving data from the current session. The first prompt will ask if you want to save your current R script file. This file has an .R extension, and contains the code that you created during your session.

Upon restarting RStudio, you should notice that all of the objects and scripts that you were previously working on have been loaded into the platform. The objects are restored from the R workspace image file. This file is automatically loaded into RStudio, and contains the objects from your prior session. It can be located within your R working directory, and has the assigned name ".RData".

If you wanted to manually create an .RData file with a unique name from the R console, you could utilize the command:

save.image("<filename>.rdata") 
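
As a minimal sketch of how a workspace image behaves, the code below saves the current workspace and then loads the image into a separate environment, so that it can be inspected without overwriting anything. The file name "MySession.RData" and the object example_values are hypothetical, and a temporary directory is assumed so that the working directory is left untouched.

```r
# Hypothetical file name, placed in a temporary directory for this sketch
workspace_file <- file.path(tempdir(), "MySession.RData")

example_values <- c(2, 4, 6)     # an object to be captured in the image
save.image(workspace_file)       # writes the objects in the global workspace

# Load the image into a separate environment to inspect it safely
restored <- new.env()
loaded_names <- load(workspace_file, envir = restored)
print(loaded_names)              # names of the objects stored in the image
```

Loading into a new environment is optional; calling load(workspace_file) on its own would restore the objects directly into the current session.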


.Rhistory is another RStudio file. It contains the history of commands that were entered at the console during the previous session. Because it is a plain text file, it can be opened and viewed with WordPad or other text editing software.

To exit R from the command line, the following command can be used:

q()


Saving Data

If you would like to save one of the data frames that you have recently edited as an R data file, this can be achieved with the following line of code:

save(<dataframename>, file = "<filepathway>.rda")

Example:

save(DataFrameA, file = "C:\\Users\\Desktop\\DataFrameA.rda")

Subsequently, if you would like to re-load this data, the following code can be utilized:

load("<filepathway>")

Example:

load("C:\\Users\\Desktop\\DataFrameA.rda")
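
To illustrate the round trip, here is a small sketch that saves a data frame, removes it from the session, and then loads it back. The contents of DataFrameA are invented for the example, and a temporary directory is assumed in place of a real file pathway.

```r
# Invented example data; a temporary directory stands in for a real path
DataFrameA <- data.frame(id = 1:3, score = c(90.5, 82.0, 77.25))
rda_path <- file.path(tempdir(), "DataFrameA.rda")

save(DataFrameA, file = rda_path)   # write the object to disk

original <- DataFrameA              # keep a copy for comparison
rm(DataFrameA)                      # remove the object from the session

loaded_names <- load(rda_path)      # load() returns the restored object names
print(loaded_names)
print(identical(DataFrameA, original))
```

Note that load() restores the object under the name it had when it was saved, so there is no need to assign the result to a variable.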

However, if you would prefer to have your data saved in a format that can be accessed by programs other than R, you may want to consider saving your data as either a comma-separated values file, or as a tab-delimited file. The code for accomplishing such is below:

# Saving as a comma-separated values file #

write.table(<Dataframename>, file = "<filepathway>.csv", sep = ",", col.names = NA, row.names = TRUE)

Example:

write.table(DataFrameA, file = "C:\\Users\\Desktop\\DataFrameA.csv", sep = ",", col.names = NA, row.names = TRUE)

# Saving as a tab-delimited file #

write.table(<Dataframename>, file = "<filepathway>.tsv", sep="\t")

Example:

write.table(DataFrameA, file = "C:\\Users\\Desktop\\DataFrameA.tsv", sep="\t")

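The options col.names = NA and row.names = TRUE, used in the comma-separated example above, prevent a common formatting error. When row names are written without these options, the header line contains one fewer field than the data rows, which causes the column labels to be shifted when the file is opened in other programs. Setting col.names = NA adds a blank header entry above the row names, so that every label stays over its own column. The same options can be added to the tab-delimited call.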
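
The effect of these options can be seen by comparing the header line that write.table() produces with and without them. The sketch below uses an invented data frame and temporary files; the object and file names are assumptions made for the illustration.

```r
# Invented example data with row names
DataFrameA <- data.frame(HR = c(12, 0, 31),
                         row.names = c("Adams", "Baker", "Cruz"))

default_path <- file.path(tempdir(), "default.csv")
aligned_path <- file.path(tempdir(), "aligned.csv")

# Default behavior: the header has no entry for the row name column
write.table(DataFrameA, file = default_path, sep = ",")

# With col.names = NA, a blank header entry is added for the row names
write.table(DataFrameA, file = aligned_path, sep = ",",
            col.names = NA, row.names = TRUE)

default_header <- readLines(default_path, n = 1)
aligned_header <- readLines(aligned_path, n = 1)
print(default_header)
print(aligned_header)

# The aligned file can be read back with the first column as row names
restored <- read.csv(aligned_path, row.names = 1)
print(restored)
```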


Installing and Enabling Packages

If you wanted to download and install packages directly by using the command line interface, you could do so with the following code:

install.packages("<packagename>")

If you would like to use an auxiliary package within your code, you would first have to enable it during your current R session. This can be accomplished by running the code below:

library(<packagename>)
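
The two steps are often combined so that a package is only downloaded when it is not already installed. The following is one possible sketch of that pattern, not the only way to do it. The "stats" package is used purely as a stand-in, because it ships with R and therefore triggers no download.

```r
# "stats" is a stand-in package name; it ships with R, so nothing is downloaded
package_name <- "stats"

# requireNamespace() returns FALSE when the package is not installed
if (!requireNamespace(package_name, quietly = TRUE)) {
  install.packages(package_name)
}

# character.only = TRUE lets library() accept the name stored in a variable
library(package_name, character.only = TRUE)
```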


Clear 'R' Workspace

If you would prefer to have a clear workspace during your current R session, you may utilize the following code:

rm(list = ls(all.names = TRUE))
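
To show what this command removes without wiping a real session, the sketch below applies the same idea to a separate, throwaway environment. The environment and object names are hypothetical. It also shows why the all.names argument matters: objects whose names begin with a period are hidden from ls() by default.

```r
# A throwaway environment stands in for the workspace in this sketch
scratch <- new.env()
assign("visible_value", 1, envir = scratch)
assign(".hidden_value", 2, envir = scratch)

print(ls(scratch))                     # hidden names are not listed
print(ls(scratch, all.names = TRUE))   # hidden names are included

# Remove everything, including objects whose names begin with a period
rm(list = ls(scratch, all.names = TRUE), envir = scratch)
remaining <- length(ls(scratch, all.names = TRUE))
print(remaining)
```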

Clear ‘R’ Console Log

If you would like to clear the log of the ‘R’ console, and you are using a Windows PC, simply press the following keys simultaneously to do so:

Ctrl + L

Disable Scientific Notation in ‘R’ Console Log Output

If you would like to prevent ‘R’ from displaying numbers in scientific notation, you may utilize the following code:

options(scipen = 999)
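
As a brief illustration, the sketch below formats the same number before and after changing the setting, and then restores the previous value. It assumes that R's default of scipen = 0 is a reasonable starting point for the comparison.

```r
# Start from R's default penalty and remember the previous setting
old_options <- options(scipen = 0)

default_text <- format(1e10)   # scientific notation under the default
options(scipen = 999)
fixed_text <- format(1e10)     # fixed notation once the penalty is large

print(default_text)
print(fixed_text)

options(old_options)           # restore the earlier setting
```

The scipen value acts as a penalty applied when R chooses between fixed and scientific notation, so a large value such as 999 makes fixed notation the preferred choice in practice.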

Set Zero Values to NA

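By default, NA values propagate through R calculations, but many functions, such as mean() and sum(), can skip them when na.rm = TRUE is supplied. It may therefore be useful at times to change 0 values to NA, so that they can be excluded in this way. This can be achieved with the code below: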

<DataFrameName$Variable>[<DataFrameName$Variable> == 0] <- NA

Example:

BaseballPlayers$HR[BaseballPlayers$HR == 0] <- NA
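
The sketch below shows the effect of this replacement on a calculation. The contents of BaseballPlayers are invented for the example.

```r
# Invented example data
BaseballPlayers <- data.frame(Player = c("A", "B", "C", "D"),
                              HR = c(10, 0, 20, 0))

mean_with_zeros <- mean(BaseballPlayers$HR)      # zeros count toward the mean

# Replace zero values with NA
BaseballPlayers$HR[BaseballPlayers$HR == 0] <- NA

mean_default <- mean(BaseballPlayers$HR)               # NA, since NA propagates
mean_no_na <- mean(BaseballPlayers$HR, na.rm = TRUE)   # NA values are skipped

print(mean_with_zeros)
print(mean_default)
print(mean_no_na)
```

Whether a zero should be treated as missing depends on the data. A player with zero home runs is not the same as a player whose total was never recorded, so this replacement is best reserved for cases where 0 was used as a placeholder.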

The next article will begin a series of articles pertaining to data modeling within R. Please stay tuned to this blog, as I can promise you that the next batch of entries will be incredibly useful for your data endeavors.
