本文介绍了isnumeric() 与 PostgreSQL的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我需要确定给定的字符串是否可以在 SQL 语句中解释为数字(整数或浮点数).如下:

I need to determine whether a given string can be interpreted as a number (integer or floating point) in an SQL statement. As in the following:

SELECT AVG(CASE WHEN x ~ '^[0-9]*.?[0-9]*$' THEN x::float ELSE NULL END) FROM test

我发现可以使用 Postgres 的 模式匹配为了这.所以我改编了这个地方 合并浮点数.这是我的代码:

I found that Postgres' pattern matching could be used for this. And so I adapted the statement given in this place to incorporate floating point numbers. This is my code:

WITH test(x) AS (
    VALUES (''), ('.'), ('.0'), ('0.'), ('0'), ('1'), ('123'),
    ('123.456'), ('abc'), ('1..2'), ('1.2.3.4'))

SELECT x
     , x ~ '^[0-9]*.?[0-9]*$' AS isnumeric
FROM test;

输出:

    x    | isnumeric 
---------+-----------
         | t
 .       | t
 .0      | t
 0.      | t
 0       | t
 1       | t
 123     | t
 123.456 | t
 abc     | f
 1..2    | f
 1.2.3.4 | f
(11 rows)

如您所见,前两项(空字符串 '' 和唯一句点 '.')被错误分类为数字类型(它们不是).我现在无法接近这一点.任何帮助表示赞赏!

As you can see, the first two items (the empty string '' and the sole period '.') are misclassified as being a numeric type (which they are not). I can't get any closer to this at the moment. Any help appreciated!

更新基于这个答案(及其评论),我调整了模式以:

Update Based on this answer (and its comments), I adapted the pattern to:

WITH test(x) AS (
    VALUES (''), ('.'), ('.0'), ('0.'), ('0'), ('1'), ('123'),
    ('123.456'), ('abc'), ('1..2'), ('1.2.3.4'), ('1x234'), ('1.234e-5'))

SELECT x
     , x ~ '^([0-9]+[.]?[0-9]*|[.][0-9]+)$' AS isnumeric
FROM test;

给出:

     x    | isnumeric 
----------+-----------
          | f
 .        | f
 .0       | t
 0.       | t
 0        | t
 1        | t
 123      | t
 123.456  | t
 abc      | f
 1..2     | f
 1.2.3.4  | f
 1x234    | f
 1.234e-5 | f
(13 rows)

正如我现在看到的,科学记数法和负数仍然存在一些问题.

There are still some issues with the scientific notation and with negative numbers, as I see now.

推荐答案

您可能已经注意到,基于正则表达式的方法几乎不可能正确执行.例如,您的测试表明 1.234e-5 不是有效数字,但实际上确实如此.此外,您错过了负数.如果某些东西看起来像一个数字,但是当您尝试存储它时会导致溢出怎么办?

As you may noticed, regex-based method is almost impossible to do correctly. For example, your test says that 1.234e-5 is not valid number, when it really is. Also, you missed negative numbers. What if something looks like a number, but when you try to store it it will cause overflow?

相反,我建议创建尝试实际转换为 NUMERIC(或 FLOAT,如果您的任务需要它)并返回 TRUE 的函数> 或 FALSE 取决于此转换是否成功.

Instead, I would recommend to create function that tries to actually cast to NUMERIC (or FLOAT if your task requires it) and returns TRUE or FALSE depending on whether this cast was successful or not.

这段代码将完全模拟函数ISNUMERIC():

This code will fully simulate function ISNUMERIC():

CREATE OR REPLACE FUNCTION isnumeric(text) RETURNS BOOLEAN AS $$
DECLARE x NUMERIC;
BEGIN
    x = $1::NUMERIC;
    RETURN TRUE;
EXCEPTION WHEN others THEN
    RETURN FALSE;
END;
$$
STRICT
LANGUAGE plpgsql IMMUTABLE;

对您的数据调用此函数会得到以下结果:

Calling this function on your data gets following results:

WITH test(x) AS ( VALUES (''), ('.'), ('.0'), ('0.'), ('0'), ('1'), ('123'),
  ('123.456'), ('abc'), ('1..2'), ('1.2.3.4'), ('1x234'), ('1.234e-5'))
SELECT x, isnumeric(x) FROM test;

    x     | isnumeric
----------+-----------
          | f
 .        | f
 .0       | t
 0.       | t
 0        | t
 1        | t
 123      | t
 123.456  | t
 abc      | f
 1..2     | f
 1.2.3.4  | f
 1x234    | f
 1.234e-5 | t
 (13 rows)

它不仅更正确、更容易阅读,而且如果数据实际上是一个数字,它的工作速度也会更快.

Not only it is more correct and easier to read, it will also work faster if data was actually a number.

这篇关于isnumeric() 与 PostgreSQL的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

10-14 04:49