Building a simple parser that can parse different date formats using PyParse

Problem description

I am building a simple parser that takes a query like the following:

'show fizi commits from 1/1/2010 to 11/2/2006'

So far I have:

from pyparsing import CaselessKeyword, Combine, Optional, Word, alphas, nums

class QueryParser(object):

    def parser(self, stmnt):
        keywords = ["select", "from", "to", "show", "commits", "where", "group by", "order by", "and", "or"]
        [select, _from, _to, show, commits, where, groupby, orderby, _and, _or] = [CaselessKeyword(word) for word in keywords]

        # A user name is a run of letters and dots, optionally followed by "'s".
        user = Word(alphas + ".")
        user2 = Combine(user + "'s")

        startdate = self.getdate()
        enddate = self.getdate()

        # Try user2 first; otherwise user always wins and the "'s" form never matches.
        bnf = ((show | select) + (user2 | user).setResultsName("user") + commits.setResultsName("stats")
               + Optional(_from + startdate.setResultsName("start") + _to + enddate.setResultsName("end")))

        return bnf.parseString(stmnt)

    def getdate(self):
        # Only matches three slash-separated integers, e.g. 1/1/2010.
        integer = Word(nums).setParseAction(lambda t: int(t[0]))
        date = Combine(integer('year') + '/' + integer('month') + '/' + integer('day'))
        #date.setParseAction(self.convertToDatetime)
        return date

I would like the dates to be more generic, meaning the user can provide 20 Jan, 2010 or some other date format. I found a good date parser online that does exactly that: it takes a date as a string and parses it. So what I am left with is feeding that function the date strings I get from my parser. How do I go about tokenizing and capturing the two date strings? For now the grammar only captures the 'y/m/d' format. Is there a way to just get the entire string regardless of how it is formatted, something like capturing whatever comes right after the keywords from and to? Any help is greatly appreciated.

Recommended answer

A simple approach is to require that the dates be quoted. A rough example follows; you will need to adjust it to fit your current grammar:

from pyparsing import CaselessKeyword, quotedString, removeQuotes
from dateutil.parser import parse as parse_date

dp = (
    CaselessKeyword('from') + quotedString.setParseAction(removeQuotes)('from') +
    CaselessKeyword('to') + quotedString.setParseAction(removeQuotes)('to')
)

res = dp.parseString('from "jan 20" to "apr 5"')
from_date = parse_date(res['from'])
to_date = parse_date(res['to'])
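# dateutil fills in missing fields from today's date, so the year is the current one (2015 in this run):
# from_date, to_date == (datetime.datetime(2015, 1, 20, 0, 0), datetime.datetime(2015, 4, 5, 0, 0))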
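
If quoting the dates is not acceptable, one possible alternative is to capture the raw text between the keywords and hand it to the date parser unchanged. The following is only a sketch under assumptions: the names `query` and `parse_query` are made up for illustration, the grammar is a cut-down version of the one in the question, and it relies on pyparsing's `SkipTo` and `restOfLine`.

```python
from pyparsing import CaselessKeyword, Optional, SkipTo, Word, alphas, restOfLine

show = CaselessKeyword("show")
commits = CaselessKeyword("commits")
from_kw = CaselessKeyword("from")
to_kw = CaselessKeyword("to")

user = Word(alphas + ".")

def strip_token(tokens):
    # Drop surrounding whitespace left around the captured text.
    return tokens[0].strip()

# Everything between "from" and the keyword "to" is the start date text.
start_text = SkipTo(to_kw).setParseAction(strip_token)("start")
# Everything after "to" up to the end of the line is the end date text.
end_text = restOfLine.copy().setParseAction(strip_token)("end")

query = (show + user("user") + commits("stats")
         + Optional(from_kw + start_text + to_kw + end_text))

def parse_query(text):
    """Return (user, start, end); the dates are raw strings or None."""
    res = query.parseString(text, parseAll=True)
    return res["user"], res.get("start"), res.get("end")

print(parse_query("show fizi commits from 20 Jan, 2010 to 11/2/2006"))
```

The two returned strings can then be passed to `parse_date` as in the answer above. Because `to_kw` is a keyword, it will not match the letters "to" inside a word such as "October", but a date phrase containing the standalone word "to" would still split in the wrong place, which is one reason the quoted form is more robust.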

10-30 06:02