R Code

Showing posts from January, 2018

How to edit data manually?

January 16, 2018

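Say we are working on the mtcars data.
Code: mtcars <- edit(mtcars)
This opens the data in a spreadsheet-like editor, where we can change the values manually. Closing the editor returns the edited data, and the assignment stores it back in mtcars. Nothing is written to disk unless we save it ourselves.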

Exploratory Data Analysis (EDA) is an approach to analysing data sets to summarise their main characteristics, often with visual methods. Exploratory data analysis was promoted by John Tukey to encourage statisticians to explore the data, and possibly formulate hypotheses that could lead to new data collection and experiments. It differs from initial data analysis (IDA), which focuses more narrowly on checking assumptions required for model fitting and hypothesis testing, handling missing values, and making transformations of variables as needed.

The purpose of exploratory data analysis is to: check for missing data and other mistakes; gain maximum insight into the data set and its underlying structure; uncover a parsimonious model, one which explains the data with a minimum number of predictor variables; check assumptions associated with any model fitting or hypothesis test; create a list of outliers or other anomalies. Find par...
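
A few of the checks listed above (missing data, main characteristics, outliers) can be sketched in base R. This is only an illustration: the built-in airquality data set and the 1.5 * IQR boxplot rule for outliers are choices made for this example, not something the post prescribes.

```r
# Illustrative data set that ships with R; it contains some missing values.
data(airquality)

# Check for missing data: count the NA values in each column.
missing_per_column <- colSums(is.na(airquality))
print(missing_per_column)

# Summarise the main characteristics of every column.
print(summary(airquality))

# List candidate outliers in Ozone using the boxplot (1.5 * IQR) rule.
ozone_outliers <- boxplot.stats(airquality$Ozone)$out
print(ozone_outliers)
```

For the visual side, hist(airquality$Ozone) or boxplot(airquality$Ozone) are the usual base-R starting points.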

View Data from Data Frames

January 08, 2018

Say the data is stored in a data frame named v1.
To see the first 6 rows, Code: head(v1)
To see the first 10 rows, Code: head(v1, n = 10) or head(v1, 10)
To see the last 6 rows, Code: tail(v1)
To see the last 10 rows, Code: tail(v1, n = 10) or tail(v1, 10)
To get the column names, Code: colnames(v1)
To get the row names, Code: rownames(v1)
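
The calls above can be tried together in one short script. In this sketch the built-in mtcars data stands in for v1, which is an assumption made only so the example runs as written.

```r
# Assumption: mtcars stands in for the data frame the post calls v1.
v1 <- mtcars

first_six <- head(v1)          # first 6 rows (the default)
first_ten <- head(v1, n = 10)  # first 10 rows
last_six  <- tail(v1)          # last 6 rows (the default)
last_ten  <- tail(v1, 10)      # last 10 rows; n can be passed by position

print(first_six)
print(colnames(v1))            # column names
print(rownames(last_ten))      # row names of the last 10 rows
```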

Read Data

January 06, 2018

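Format type: CSV
Code: read.table("c:/mydata.csv", header = TRUE, sep = ",")
To pick the file from a dialog instead, Code: read.csv(file = file.choose())
After setting the working directory, the file name alone is enough, Code: read.table("file name", header = TRUE, sep = ",")
Other formats: for Excel files use the "readxl" package; for SPSS, SAS and Stata files use the "haven" package.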
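
To try the two CSV readers without needing a file at c:/mydata.csv, the sketch below first writes a small CSV to a temporary location and then reads it back both ways. The temporary file and the use of mtcars are assumptions made for the example.

```r
# Write a small CSV to a temporary file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
write.csv(head(mtcars, 5), csv_path, row.names = FALSE)

# read.table needs to be told about the header row and the comma separator.
from_read_table <- read.table(csv_path, header = TRUE, sep = ",")

# read.csv is read.table with header = TRUE and sep = "," as its defaults.
from_read_csv <- read.csv(csv_path)

print(from_read_table)
print(all.equal(from_read_table, from_read_csv))
```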

Working Directory

January 06, 2018

First find out where your working directory is set at the moment.
Code: getwd()
To change the path,
Code: setwd("<location of your dataset>")
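
A minimal sketch of the round trip: remember the current directory, switch to another one, and switch back. R's temporary directory is used as the target only because it exists on every machine; in practice it would be the folder holding your dataset.

```r
# Remember where we are so we can return afterwards.
old_dir <- getwd()
print(old_dir)

# Assumption: tempdir() stands in for "<location of your dataset>".
new_dir <- tempdir()
setwd(new_dir)
print(getwd())

# Restore the original working directory.
setwd(old_dir)
```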

Useful Libraries

January 06, 2018

"dplyr" To use this he data must be Tidy The dplyr package provides a concise set of operations for managing data frames. With these functions we can do a number of complex operations in just a few lines of code. In particular, we can often conduct the beginnings of an exploratory analysis with the powerful combination of group_by() and summarize() . One important contribution of the dplyr package is that it provides a “grammar” (in particular, verbs) for data manipulation and for operating on data frames. PipeLine Operator - %>% It can Perform Select Filter Sorting Rename Mutate Group_by "ggplot" *Scatter Plot* (geom point) *Histogram* (geom_histogram) *Density* (geom_density) *Boxplots* ( geom_boxplot ) line just remove geom point Code 1 ggplot(data, aes(x=quantity, y=price)) + geom_point() + geom_smooth() 2 ggplot(data, aes(x=quantity, y=price, color = size, s...

Installing Packages

January 06, 2018

In the R console, type install.packages("PackageName") and hit Enter; the package will install.
To use it, type library(PackageName) and hit Enter; the package will load and can be used as required.
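
Installing only needs to happen once, while library() is needed in every new session. A common pattern, sketched here, is to install only when the package is missing. The package name "dplyr" and the CRAN mirror URL are assumptions chosen for the example.

```r
# Assumption: "dplyr" is just an example package name.
pkg <- "dplyr"

# Install only if the package is not already available.
if (!requireNamespace(pkg, quietly = TRUE)) {
  # Assumption: the CRAN cloud mirror; any CRAN mirror would do.
  install.packages(pkg, repos = "https://cloud.r-project.org")
}

# character.only = TRUE lets library() take the name from a variable.
library(pkg, character.only = TRUE)
```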