Useful Libraries
"dplyr"
To use this, the data must be tidy.
The dplyr package provides a concise set of operations for managing data frames. With these functions we can do a number of complex operations in just a few lines of code. In particular, we can often conduct the beginnings of an exploratory analysis with the powerful combination of group_by() and summarize().
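As a minimal sketch of the group_by() and summarize() combination, the snippet below computes a count and a mean per group. The sales data frame and its column names are hypothetical, made up only for illustration.

```r
library(dplyr)

# Hypothetical toy data; the names are illustrative, not from a real dataset.
sales <- data.frame(
  clarity = c("A", "A", "B", "B", "B"),
  price   = c(100, 200, 150, 250, 350)
)

# One output row per clarity group, with the group size and the mean price.
summary_by_clarity <- sales %>%
  group_by(clarity) %>%
  summarize(n = n(), mean_price = mean(price))

print(summary_by_clarity)
```

Each group collapses to a single row, which is usually the first table worth looking at in an exploratory analysis.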
One important contribution of the dplyr package is that it provides a “grammar” (in particular, verbs) for data manipulation and for operating on data frames.
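Pipeline Operator - %>%
Chained with the pipe, dplyr verbs can perform:
- Select (select())
- Filter (filter())
- Sorting (arrange())
- Rename (rename())
- Mutate (mutate())
- Group by (group_by())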
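The verbs listed above can be chained with %>% roughly as follows. This is a sketch on a hypothetical items data frame; the column names quantity, price and clarity are borrowed from the plotting examples in these notes, and the values are invented.

```r
library(dplyr)

# Hypothetical toy data standing in for a tidy data frame.
items <- data.frame(
  quantity = c(5, 1, 3, 2),
  price    = c(50, 20, 45, 10),
  clarity  = c("A", "B", "A", "B")
)

result <- items %>%
  select(quantity, price) %>%            # keep only these columns
  filter(quantity > 1) %>%               # keep rows with quantity above 1
  rename(qty = quantity) %>%             # rename a column (new = old)
  mutate(unit_price = price / qty) %>%   # add a derived column
  arrange(desc(unit_price))              # sort, highest unit price first

print(result)
```

Each step receives the data frame produced by the previous one, so the pipeline reads top to bottom in the order the operations happen.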
"ggplot2"
*Scatter Plot* (geom_point)
*Histogram* (geom_histogram)
*Density* (geom_density)
*Boxplots* (geom_boxplot)
*Line* (remove geom_point and keep only the line layer, such as geom_smooth)
Code
ggplot(data, aes(x=quantity, y=price)) + geom_point() + geom_smooth()
2
ggplot(data, aes(x=quantity, y=price, color = size, size = variable)) + geom_point() + geom_smooth()
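# geom_smooth(se=FALSE) - line without the grey confidence band
# geom_smooth() - line with the grey confidence band
# geom_smooth(se=FALSE, method='lm') - straight-line (linear model) fit without the band
# Faceting can also be applied to these plots (see Faceting)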
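To make the geom_smooth() options concrete, here is a small sketch. It assumes ggplot2's built-in diamonds dataset as a stand-in for data, with carat and price playing the roles of quantity and price; that mapping is an assumption, not something these notes specify.

```r
library(ggplot2)

# Assumption: a 500-row sample of the built-in diamonds data stands in for `data`.
set.seed(1)
d_small <- diamonds[sample(nrow(diamonds), 500), ]

base <- ggplot(d_small, aes(x = carat, y = price)) + geom_point()

p_default <- base + geom_smooth()                           # smooth curve with grey band
p_no_band <- base + geom_smooth(se = FALSE)                 # same curve, band hidden
p_linear  <- base + geom_smooth(method = "lm", se = FALSE)  # straight linear fit
```

Printing any of the three objects (for example print(p_linear)) draws the plot.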
Faceting
* The graph breaks into one panel per clarity value (say, 10 values)
ggplot(data, aes(x=quantity, y=price, color = size, size = variable)) + geom_point() + facet_wrap(~ clarity)
ggplot(data, aes(x=quantity, y=price, color = size, size = variable)) + geom_point() + facet_grid(color~ clarity)
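The difference between the two faceting calls can be sketched as below. The built-in diamonds dataset is again an assumed stand-in for data; it happens to have clarity and color columns.

```r
library(ggplot2)

# Assumption: a sample of the built-in diamonds data stands in for `data`.
set.seed(1)
d_small <- diamonds[sample(nrow(diamonds), 500), ]

base <- ggplot(d_small, aes(x = carat, y = price)) + geom_point()

# facet_wrap: one panel per clarity level, wrapped into rows as needed.
p_wrap <- base + facet_wrap(~ clarity)

# facet_grid: a full grid, rows split by color and columns by clarity.
p_grid <- base + facet_grid(color ~ clarity)
```

facet_wrap() suits a single splitting variable, while facet_grid() lays out every combination of two variables, including empty ones.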
Adding a Title to the Chart
ggplot(data, aes(x=quantity, y=price, color = size, size = variable)) + geom_point() + ggtitle("my visual scatter plot") + xlab("quant")+xlim(0,2)
# ggtitle("my visual scatter plot") - sets the chart title
# xlab("quant") - sets the x-axis label
# xlim(0,2) - limits the x-axis range to 0 through 2
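One detail worth knowing: xlim() discards observations outside the range before anything is drawn or fitted, while coord_cartesian() only zooms the view. The sketch below shows the zooming variant with labs() as an alternative way to set the title and axis label; the diamonds data is an assumed stand-in for data.

```r
library(ggplot2)

# Assumption: a sample of the built-in diamonds data stands in for `data`.
set.seed(1)
d_small <- diamonds[sample(nrow(diamonds), 500), ]

p_labeled <- ggplot(d_small, aes(x = carat, y = price)) +
  geom_point() +
  labs(title = "my visual scatter plot", x = "quant") +  # title and x-axis label
  coord_cartesian(xlim = c(0, 2))                        # zoom without dropping rows
```

With only points on the plot, both approaches look the same; they differ once a layer such as geom_smooth() computes a statistic from the data.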
Histograms and Density Plots
1
ggplot(data, aes(x=quantity)) + geom_histogram(binwidth=200)
2 (gives a color-stacked histogram)
ggplot(data, aes(x=quantity, fill = clarity)) + geom_histogram()
# scales = "free_y" in facet_wrap() lets each histogram panel use its own y-axis scale (example 3)
3
ggplot(data, aes(x=quantity)) + geom_histogram(binwidth=200) + facet_wrap(~ clarity, scales = "free_y")
Density Plots
/Output graph is drawn in black/
ggplot(data, aes(x=quantity)) + geom_density()
/Output graph is drawn in color, one curve per size value/ (helpful for comparing multiple distributions)
ggplot(data, aes(x=quantity, color = size)) + geom_density()
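When several density curves overlap, filling them with some transparency can make the comparison easier to read. A minimal sketch, assuming the built-in diamonds data with cut as the grouping variable (both are stand-ins, not the data used in these notes):

```r
library(ggplot2)

# Assumption: a sample of the built-in diamonds data stands in for `data`.
set.seed(1)
d_small <- diamonds[sample(nrow(diamonds), 500), ]

# One semi-transparent filled density curve per cut level.
p_density <- ggplot(d_small, aes(x = price, fill = cut)) +
  geom_density(alpha = 0.3)
```

The alpha value of 0.3 is an arbitrary choice; lower values make heavily overlapping groups easier to tell apart.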
Boxplots and Violin Plots
Boxplots (if the data is normally distributed)
ggplot(data, aes(x=quantity, y=price)) + geom_boxplot()
If there are many outliers, switch the y-axis to a log scale:
ggplot(data, aes(x=quantity, y=price)) + geom_boxplot() + scale_y_log10()
Violin Plots (if the data is not normally distributed)
ggplot(data, aes(x=quantity, y=price)) + geom_violin() + scale_y_log10()
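Boxplots and violin plots are most informative when x is categorical, so that each category gets its own box or violin. The sketch below assumes the built-in diamonds data with cut as the categorical variable; with a continuous x, ggplot2 draws a single box unless a group aesthetic is supplied.

```r
library(ggplot2)

# Assumption: a sample of the built-in diamonds data stands in for `data`.
set.seed(1)
d_small <- diamonds[sample(nrow(diamonds), 500), ]

# One box per cut level, with a log-scaled y-axis to compress the outliers.
p_box <- ggplot(d_small, aes(x = cut, y = price)) +
  geom_boxplot() +
  scale_y_log10()

# Same layout as violins, which show the full shape of each distribution.
p_violin <- ggplot(d_small, aes(x = cut, y = price)) +
  geom_violin() +
  scale_y_log10()
```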
Saving the plots
d = ggplot(data, aes(x=quantity, y=price)) + geom_boxplot()
The plot is now stored in the object called d.
Image format
# ggsave() picks the file type from the filename extension
ggsave(filename = "box1.jpeg", plot = d)
ggsave(filename = "box1.png", plot = d)
PDF Format
ggsave(filename = "box1.pdf", plot = d)
Alternatively, use the graphical method: in the plot output window choose Export, then save as an image or as a PDF (you can select the aspect ratio for the image).
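For saving from code, a slightly fuller sketch is shown here. It writes to a temporary directory and sets an explicit size; the 6 by 4 inch dimensions and the diamonds data are assumptions chosen only for illustration.

```r
library(ggplot2)

# Assumption: the built-in diamonds data stands in for `data`.
d <- ggplot(diamonds, aes(x = cut, y = price)) + geom_boxplot()

# Write into a temporary directory so the sketch leaves no files behind.
out_pdf <- file.path(tempdir(), "box1.pdf")

# The .pdf extension selects the PDF device; width and height are in inches.
ggsave(filename = out_pdf, plot = d, width = 6, height = 4)

file.exists(out_pdf)
```

Changing the extension in the filename (for example to .png or .jpeg) changes the output format, and passing plot = d explicitly avoids relying on whichever plot was displayed last.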