Understanding vectorization

Problem description

I was looking for a way to format large numbers in R as 2.3K or 5.6M. I found this solution on SO.
Turns out, it shows some strange behaviour for some input vectors. Here is what I am trying to understand:

    # Test vector with weird behaviour
    x <- c(302.456500093388, 32553.3619756151, 3323.71232001074, 12065.4076372462,
           0, 6270.87962956305, 383.337515655172, 402.20778095643, 19466.0204345063,
           1779.05474064539, 1467.09928489114, 3786.27112222457, 2080.08078309959,
           51114.7097545816, 51188.7710104291, 59713.9414049798)

    # Formatting function for large numbers
    comprss <- function(tx) {
      div <- findInterval(as.numeric(gsub("\\,", "", tx)),
                          c(1, 1e3, 1e6, 1e9, 1e12))
      paste(round(as.numeric(gsub("\\,", "", tx)) / 10^(3 * (div - 1)), 1),
            c('', 'K', 'M', 'B', 'T')[div], sep = '')
    }

    # Compare outputs for the following three commands
    x
    comprss(x)
    sapply(x, comprss)

We can see that comprss(x) produces 0K as the 5th element, which is weird, but comprss(x[5]) gives the expected result. The 6th element is even weirder.

As far as I know, all the functions used in the body of comprss are vectorised. Then why do I still need to sapply my way out of this?

Solution

Here's a vectorized version adapted from pryr:::print.bytes:

    format_for_humans <- function(x, digits = 3) {
      # Power of 1000 each value falls into; pmax() keeps it from going below 0
      grouping <- pmax(floor(log(abs(x), 1000)), 0)
      paste0(signif(x / (1000 ^ grouping), digits = digits),
             c('', 'K', 'M', 'B', 'T')[grouping + 1])
    }

    format_for_humans(10 ^ seq(0, 12, 2))
    #> [1] "1"    "100"  "10K"  "1M"   "100M" "10B"  "1T"

    x <- c(302.456500093388, 32553.3619756151, 3323.71232001074, 12065.4076372462,
           0, 6270.87962956305, 383.337515655172, 402.20778095643, 19466.0204345063,
           1779.05474064539, 1467.09928489114, 3786.27112222457, 2080.08078309959,
           51114.7097545816, 51188.7710104291, 59713.9414049798)

    format_for_humans(x)
    #> [1] "302"   "32.6K" "3.32K" "12.1K" "0"     "6.27K" "383"   "402"
    #> [9] "19.5K" "1.78K" "1.47K" "3.79K" "2.08K" "51.1K" "51.2K" "59.7K"

    format_for_humans(x, digits = 1)
    #> [1] "300" "30K" "3K"  "10K" "0"   "6K"  "400" "400" "20K" "2K"  "1K"
    #> [12] "4K"  "2K"  "50K" "50K" "60K"
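The answer gives a working replacement but does not say why comprss misbehaves. As far as I can tell, the cause is zero indexing rather than a lack of vectorization: findInterval() returns 0 for any value below the first break (here, the 0 in x), and R silently drops zero indices when subsetting. The suffix vector therefore comes out one element shorter than the number vector, and paste() recycles it, so every suffix after the zero is shifted by one position. With sapply, each call sees a single value, so nothing can shift. The sketch below reproduces this on a shortened vector; the names breaks, suffixes and comprss_fixed are my own, not from the original post, and clamping the index with pmax() is just one possible repair.

```r
# Shortened test vector containing a zero
x <- c(302.4565, 32553.36, 0, 6270.88)

breaks   <- c(1, 1e3, 1e6, 1e9, 1e12)
suffixes <- c("", "K", "M", "B", "T")

# findInterval() returns 0 for values below the first break
div <- findInterval(x, breaks)
div
#> [1] 1 2 0 2

# A zero index selects nothing, so only 3 suffixes come back for 4 numbers
suffixes[div]
#> [1] ""  "K" "K"

# paste0() recycles the shorter vector, so the suffixes after the zero shift
paste0(round(x / 10^(3 * (div - 1)), 1), suffixes[div])
#> [1] "302.5" "32.6K" "0K"    "6.3"

# Hypothetical repair: clamp the index to at least 1 so nothing is dropped
comprss_fixed <- function(tx) {
  num <- as.numeric(gsub(",", "", tx))
  div <- pmax(findInterval(abs(num), breaks), 1)
  paste0(round(num / 10^(3 * (div - 1)), 1), suffixes[div])
}

comprss_fixed(x)
#> [1] "302.5" "32.6K" "0"     "6.3K"
```

The same clamping idea appears in the answer's format_for_humans, where pmax(..., 0) keeps grouping + 1 from ever falling below 1.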
05-21 09:42