问题:

现在有一张用户登陆日志表,该表包括user_id,ip,log_time三个字段,现在需要找出共同使用ip数量超过3个(含)的所有用户对。比如下面的示例数据,101和102用户共同使用的ip为4个,101和103用户共同使用的ip为3个,102和103用户共同使用的ip为3个。

(101,'192.168.10.101','2022-05-10 11:00:00'),
(101,'192.168.10.101','2022-05-10 11:01:00'),
(101,'192.168.10.102','2022-05-10 11:02:00'),
(101,'192.168.10.103','2022-05-10 11:03:00'),
(101,'192.168.10.104','2022-05-10 11:04:00'),

(102,'192.168.10.101','2022-05-10 11:04:30'),
(102,'192.168.10.102','2022-05-10 11:05:00'),
(102,'192.168.10.103','2022-05-10 11:06:00'),
(102,'192.168.10.104','2022-05-10 11:07:00'),

(103,'192.168.10.102','2022-05-10 11:08:00'),
(103,'192.168.10.103','2022-05-10 11:08:00'),
(103,'192.168.10.104','2022-05-10 11:10:00'),

(104,'192.168.10.103','2022-05-10 11:11:00'),
(104,'192.168.10.104','2022-05-10 11:12:00'),

(105,'192.168.10.105','2022-05-10 11:13:00')

SQL解答:

问题的关键点是使用自连接,先按用户和ip去重之后进行自关联。因为如果公共使用ip达到3个及以上的话,那么同一个用户对至少会出现3条数据,筛选一下就行。

with user_login as (
select 101 as user_id ,'192.168.10.101' as ip ,'2022-05-10 11:00:00' as log_time
union all                                                            
select 101 as user_id ,'192.168.10.101' as ip ,'2022-05-10 11:01:00' as log_time
union all                                                            
select 101 as user_id ,'192.168.10.102' as ip ,'2022-05-10 11:02:00' as log_time
union all                                                            
select 101 as user_id ,'192.168.10.103' as ip ,'2022-05-10 11:03:00' as log_time
union all                                                            
select 101 as user_id ,'192.168.10.104' as ip ,'2022-05-10 11:04:00' as log_time
union all                                                            
select 102 as user_id ,'192.168.10.101' as ip ,'2022-05-10 11:04:30' as log_time
union all                                                            
select 102 as user_id ,'192.168.10.102' as ip ,'2022-05-10 11:05:00' as log_time
union all                                                            
select 102 as user_id ,'192.168.10.103' as ip ,'2022-05-10 11:06:00' as log_time
union all                                                            
select 102 as user_id ,'192.168.10.104' as ip ,'2022-05-10 11:07:00' as log_time
union all                                                            
select 103 as user_id ,'192.168.10.102' as ip ,'2022-05-10 11:08:00' as log_time
union all                                                            
select 103 as user_id ,'192.168.10.103' as ip ,'2022-05-10 11:08:00' as log_time
union all                                                            
select 103 as user_id ,'192.168.10.104' as ip ,'2022-05-10 11:10:00' as log_time
union all                                                            
select 104 as user_id ,'192.168.10.103' as ip ,'2022-05-10 11:11:00' as log_time
union all                                                            
select 104 as user_id ,'192.168.10.104' as ip ,'2022-05-10 11:12:00' as log_time
union all                                                            
select 105 as user_id ,'192.168.10.105' as ip ,'2022-05-10 11:13:00' as log_time
),
tmp as
(
    select 
    user_id
    ,ip
    from user_login --实际换成自己的表或上面的样例数据
    group by user_id,ip  --同一个ip同一用户可能多次登录,先去重
)

select
t1.user_id
,t2.user_id
,count(t1.ip) as ip_cnt
from tmp t1
inner join tmp t2
on t1.ip=t2.ip  --通过ip自关联
where t1.user_id<t2.user_id  --因为存在101对102,102对101的情况,保留一种即可
group by t1.user_id,t2.user_id
having ip_cnt>=3  --保留用户对ip数量超过3个的(含)
12-26 05:16