This article shows how to visualize groups of Poisson random samples with ggridges.

Question

I have two sets of data in a single data frame. The first set was collected at location 1 and the second at location 2. For each location there are count data (column value) for 5 months, with 10 observations per month.

# Data set
#-----------------
rp_data <- structure(list(Month = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L,
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L,
4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label = c("1",
"2", "3", "4", "5"), class = "factor"), location = c("1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1", "1",
"1", "1", "1", "1", "1", "1", "1", "1", "1", "2", "2", "2", "2",
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2",
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2",
"2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2", "2",
"2", "2", "2", "2", "2", "2", "2"), value = c(0L, 1L, 1L, 1L,
2L, 1L, 0L, 0L, 1L, 1L, 3L, 2L, 1L, 4L, 1L, 3L, 1L, 1L, 1L, 1L,
2L, 2L, 1L, 0L, 2L, 4L, 3L, 5L, 5L, 0L, 4L, 3L, 3L, 4L, 2L, 5L,
2L, 3L, 10L, 6L, 5L, 6L, 4L, 6L, 4L, 5L, 6L, 5L, 3L, 7L, 1L,
1L, 1L, 1L, 0L, 0L, 2L, 1L, 2L, 0L, 2L, 3L, 4L, 1L, 2L, 1L, 2L,
0L, 2L, 2L, 4L, 4L, 5L, 1L, 4L, 5L, 4L, 5L, 1L, 4L, 3L, 7L, 7L,
4L, 2L, 5L, 4L, 1L, 5L, 3L, 7L, 3L, 4L, 8L, 5L, 7L, 1L, 1L, 6L,
3L)), .Names = c("Month", "location", "value"), row.names = c(NA,
-100L), class = "data.frame")
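
For experimenting without pasting the full dump, a data frame of the same shape can be simulated. This is only a sketch: the name sim_data, the seed, and the choice of a Poisson mean equal to the month number are assumptions for illustration and are not taken from the original data.

```r
# Simulate a stand-in for rp_data: 2 locations x 5 months x 10 observations.
set.seed(1)  # arbitrary seed, only for reproducibility

# rep varies fastest, then Month, then location (same row order as rp_data)
design <- expand.grid(rep = 1:10, Month = 1:5, location = c("1", "2"),
                      stringsAsFactors = FALSE)

sim_data <- data.frame(
  Month    = factor(design$Month),      # factor with levels "1".."5"
  location = design$location,           # character, "1" or "2"
  # ASSUMPTION: Poisson mean equal to the month number
  value    = rpois(nrow(design), lambda = design$Month),
  stringsAsFactors = FALSE
)

str(sim_data)
```

sim_data can then be used in place of rp_data in the plotting code.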

I adapted the example below from the ggridges examples web page to display the distribution of count values in each month.

# Plot 1: data filtered to location == "1"
#---------------
library(ggplot2)
library(ggridges)

ggplot(rp_data[rp_data$location == '1',], aes(x = value, y = Month, group = Month)) +
  geom_density_ridges2(aes(fill = Month), stat = "binline", binwidth = 1, scale = 0.95) +
  geom_text(stat = "bin",
            aes(y = after_stat(group + 0.95 * count / max(count)),
                label = after_stat(ifelse(count > 0, count, ""))),
            vjust = 1.4, size = 3, color = "white", binwidth = 1) +
  scale_x_continuous(breaks = c(0:12), limits = c(-.5, 13), expand = c(0, 0),
                     name = "random value") +
  scale_y_discrete(expand = c(0.01, 0), name = "Month") +
  scale_fill_cyclical(values = c("#0000B0", "#7070D0")) +
  labs(title = "Poisson random samples location 1 different Month",
       subtitle = "sample size n=10") +
  guides(y = "none") +
  theme_ridges(grid = FALSE) +
  theme(axis.title.x = element_text(hjust = 0.5),
        axis.title.y = element_text(hjust = 0.5))

# Plot 2: data filtered to location == "2"
#---------------

ggplot(rp_data[rp_data$location == '2',], aes(x = value, y = Month, group = Month)) +
  geom_density_ridges2(aes(fill = Month), stat = "binline", binwidth = 1, scale = 0.95) +
  geom_text(stat = "bin",
            aes(y = after_stat(group + 0.95 * count / max(count)),
                label = after_stat(ifelse(count > 0, count, ""))),
            vjust = 1.4, size = 3, color = "white", binwidth = 1) +
  scale_x_continuous(breaks = c(0:12), limits = c(-.5, 13), expand = c(0, 0),
                     name = "random value") +
  scale_y_discrete(expand = c(0.01, 0), name = "Month") +
  scale_fill_cyclical(values = c("#0000B0", "#7070D0")) +
  labs(title = "Poisson random samples location 2 different Month",
       subtitle = "sample size n=10") +
  guides(y = "none") +
  theme_ridges(grid = FALSE) +
  theme(axis.title.x = element_text(hjust = 0.5),
        axis.title.y = element_text(hjust = 0.5))

Result of Plot 1

My question is: how can I combine these two plots into a single overlay plot, as shown in this example:

I don't want to draw them as two separate plots.

Recommended answer

You need to create a grouping variable that combines Month and location, for example with paste0(Month, location). For now I'm leaving out the text labels, though they should be possible with a little more thought. (I think they would make the figure too busy, though.)

ggplot(rp_data,
       aes(x = value, y = Month,
           group = paste0(Month, location),
           fill = paste0(Month, location))) +
  geom_density_ridges2(stat = "binline", binwidth = 1,
                       scale = 0.95, alpha = 0.7) +
  scale_x_continuous(breaks = c(0:12), limits = c(-.5, 13),
                     expand = c(0, 0), name = "random value") +
  scale_y_discrete(expand = c(0.01, 0), name = "Month") +
  scale_fill_cyclical(values = c("#0000B0", "#B00000",
                                 "#7070D0", "#FC5E5E")) +
  labs(title = "Poisson random samples, locations 1 and 2, by month",
       subtitle = "sample size n=10") +
  guides(y = "none") +
  theme_ridges(grid = FALSE, center = TRUE)

Now with text labels.

ggplot(rp_data, aes(x = value, y = Month, group = paste0(Month, location), fill = paste0(Month, location))) +
  geom_density_ridges2(stat = "binline", binwidth = 1, scale = 0.95, alpha = 0.7) +
  geom_text(stat = "bin",
            aes(y = after_stat(ceiling(group / 2) + 0.95 * count / max(count)),
                label = after_stat(ifelse(count > 0, count, "")), color = location),
            vjust = 1.4, size = 3, binwidth = 1, fontface = "bold") +
  scale_x_continuous(breaks = c(0:12), limits = c(-.5, 13), expand = c(0, 0),
                     name = "random value") +
  scale_y_discrete(expand = c(0.01, 0), name = "Month") +
  scale_fill_cyclical(values = c("#0000B0", "#B00000", "#7070D0", "#FC5E5E")) +
  scale_color_cyclical(values = c("white", "black")) +
  labs(title = "Poisson random samples, locations 1 and 2, by month",
       subtitle = "sample size n=10") +
  guides(y = "none") +
  theme_ridges(grid = FALSE, center = TRUE)

Again, I'm not sure it's a good idea, but there you go.
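
A note on the ceiling(group/2) term in the labelled plot: it relies on how the groups are numbered. As far as I understand, ggplot2 numbers groups in sorted order of the grouping key, so with two locations per month, groups 1 and 2 fall in the first month row, groups 3 and 4 in the second, and so on. The sketch below reproduces that mapping on a stand-in grid of Month/location combinations; combos, keys and group_map are illustrative names that do not appear in the original code.

```r
# Stand-in for the Month/location combinations in rp_data (5 months x 2 locations).
combos <- expand.grid(location = c("1", "2"), Month = factor(1:5),
                      stringsAsFactors = FALSE)

# Same key as in the plot: paste0(Month, location), e.g. "11", "12", "21", ...
keys <- sort(unique(paste0(combos$Month, combos$location)))

# ASSUMPTION: group ids follow the sorted key order,
# so group id i is drawn on month row ceiling(i / 2).
group_map <- data.frame(
  key   = keys,
  group = seq_along(keys),
  y_row = ceiling(seq_along(keys) / 2),
  stringsAsFactors = FALSE
)

print(group_map)
```

With a different number of locations per month, the divisor 2 would need to change to match.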
