本文介绍了结合全文和其他索引的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有全文索引和创建日期的索引.

我对日期的查询很快就返回了一个不错的,小的44条记录(在一秒钟之内):

> db.oneMillionDocumentsIndexed.count({created: {$lte: ISODate("2016-02-06T15:34:59.019Z")} })
44

但是,如果我将其与文本搜索结合使用,查询的速度将非常慢:

> db.oneMillionDocumentsIndexed.count({
                                created: {$lte: ISODate("2016-02-06T15:34:59.019Z")}, 
                                $text: { $search: "raven" } })

似乎同时使用了两个索引:

{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "test.oneMillionDocumentsIndexed",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "$and" : [
                {
                    "created" : {
                        "$lte" : ISODate("2016-02-06T15:34:59.019Z")
                    }
                },
                {
                    "$text" : {
                        "$search" : "raven",
                        "$language" : ""
                    }
                }
            ]
        },
        "winningPlan" : {
            "stage" : "FETCH",
            "filter" : {
                "created" : {
                    "$lte" : ISODate("2016-02-06T15:34:59.019Z")
                }
            },
            "inputStage" : {
                "stage" : "TEXT",
                "indexPrefix" : {

                },
                "indexName" : "$**_text",
                "parsedTextQuery" : {

                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "serverInfo" : {
        "host" : "Plod",
        "port" : 27017,
        "version" : "3.0.7",
        "gitVersion" : "6ce7cbe8c6b899552dadd907604559806aa2e9bd"
    },
    "ok" : 1
}    

创建的日期搜索是否会减少文档数量,从而加快查询速度?

尽管文件不是很小,但它们也不是很大.这是一个示例文档:

{
    "_id" : ObjectId("56b612a2b6c13d2bec221d22"),
    "created" : ISODate("2016-02-06T15:34:57.954Z"),
    "adoptability-integer" : 1885631649,
    "impoverisher-double" : 0.78982932576436,
    "auriga-short-string" : "unpunished",
    "pistillate-long-string" : "raven nationalistic supergalaxies shit candidacy vengefulness baghla inharmony breviaries subcoracoid facet numbles Achaian hyksos g¥ᄀtterdï¿¥ï¾ exsecant costliness assertively cufic neurotomy subfebrile reassess eruption calciphobous epithecium adipopectic eruption neurotomy impaste shrugging oxytone depredating abb¥ᄑ unfaithfulness clive amman meteorology dollond del cussed malversation Determinateness wadset busher precedent warder lithest tuberculinize kythera swiping hyperopic installation otosclerosis costly joyance neenah saliently bicepses myograph blackmur. salable radiational copaiva seisure animism franglais chalkboard astride preaortic machinelike criseyde easternmost theological. goloshes amber assertively universalism pterylological abortifacient entrepï¾¢t nordic intricate canvasser unscholastic caria marginal prakritic gal tambur seascouting branchiform vaticide hysteroidal. vario chefoo permanganic solidillu lashings permanganic denatured chartres Nonenergetically pabx coinheritance koulibiaca wrathless unrejoicing kodly confutable Juru changelessness ratite pol lightener pansy portadown unpeg iontophoresis Ruddily overcorrupt rondure midair mobocrat. Rals sind teaser hussism definiteness piperidine septicity procryptic salicaceous catalpa Stingy panegyrise Baddie wodan preoccasioned ndebele sanitizing mulga grantedly selectman dep overscruple mealies subsellia noncompressible lepidoptera nonequilateral vï¿¥ï¾ racemiform carob preaccredit parramatta. piatigorsky unmanifest eulogized bolometric circumnavigating stare. prewitt branchiform canadianizing untinselled crossruf anthozoic del dragrope pronative foulness incessancy sultanate debunker guncotton reindictment uninstalled pieter buying prestwick anguish dicrotism permissible. nonscarcity labialising underswamp nondegradation incubating unwillable dealer Rewinded jaggedness jasmine flatfootedness edgily choregraphic unpenetrating unwhited devotedly thornton irremediably reentry cordilleras inhospitable blenchingly hedgehop. nontribesman semiexhibitionist streetlike outgeneral Spatiality hyacinthides prometheus tingly tenacious Aerologist promonarchy nonsophistical uhuru unsprayable countrywoman proequality schickard. antagonize Cart undocumented heteroplastic cyclostome keratin specification tombless lambie extricating feticide reacceded redwing autokinetic ferias underpart dupr¥ᄑ preexperimental besancon dvm riksm'' unharmonised bradykinetic unforeseeableness ryukyu rootstalk aquarial uredospore kame nondissenting pachyderm southeasterner comminute excitant torturing reasoningly restabilize isotopy emergency boathouses plowmanship decidedness skeptophylaxis kelebe clive furred abuttals variometer indamine wreathe. guymon rubinstein monotriglyph inaction. bedazzle foreordinated proportioner pursy beryl slogging forbearer abirritant concur. nonleprous veriax overservility mirza relitigate richness dipteroi mischarged. inquisitress nav unimpressibility teratoma brilliantined untensing vlaardingen theorbo shostakovich appia maximally fingered ashkenazim soap unpick isocheimenal gingili synonymical interannular patronising knaggiest cleaver lassie interwound osculated unobliging portobello boxer impactive.Bladderwort wish aerothermodynamics lymphadenomata nonfundamental interdiffuse injector chaussure. polyphyletically irishising ayous sinecurist decant carbonized flickeringly stomatitic emily luteotrophin anginous Syllabic permeameter Carthal brachiator farinose justicelike azotized getaway electroencephalographically puglia unconfound appendiceal premedical vassal rubric overhearing Conative heartaching shammer staphylorrhaphy bulgar spilikin phagocytosing adenitis syntypic dissertate collyrium sonless anoxia archil mimosis irreversibly unhabituated scholiast rcs portadown mishima preimport bonavist jointedly aspergillus farinose condemnation chough blanc descanter mephistopheles ongoing unsurgical unclassifiableness namtar corniest disbudding disklike zap wheyface teetotally nonsubmission delian enrober canadian nasi hypermetabolism animadversion Unbantering recompile ineradicable blindly mren Schorlaceous viperous latish unstationed decastylos catalpa beflagged pellicular demark gassendi. macmonnies deserve subsidizer generous reassess colorfully unsummonable clave hderlin borges aechmagoras misbegotten uncontradictory unfelicitous plunderage presynsacral backband amagasaki unsavorily proenzyme ney slipslop unrhythmical Debenture rosy unreprehended sulfuryl outpeep fichtean jellylike anginous foil pixies columella nonsuggestion unwhited icier archbishop masan oireachtas coxcomb pseudosiphonic rubinstein cockerel fidel swingle submembranous despondent sarajevo camshaft inclusiveness reynard deducibility Counselling velveteen whaleback interventricular harquebuses sodomite chunk nondecayed disyllable nonfundamental funnelling pricing neuroanatomical evaporate palisades kamerun. zigzag meteorology agura puerperium misfield annulus sapper franklinton prenotion pyroxylin dustour fluming cereus nontangental metempirical Nonadjudication restated impactive.Bladderwort swingle frolic hadramaut buraydah uncarbonized sthenius uncreditableness undreading grattoir excitant bma mellers centurial broad intellectualist pursy apodemal inclusiveness laurence kentucky cyanic nonunified jason. swiping mismatch cereuses dress entrain mannikin insetting scratchy glaiketness query antipatriarchal rjcharging. fichtean lwm reidentification theurgically Baddie abut snowcreep vaud cretheus clubhouses homodyne rayah beguine coquettishly rabidness retime lithoid send epistyle undefendable christless narcomania extraprofessional paracelsic interrogatories eucrite cotswolds reverberantly recommendatory dorsally wobbly sheared malacca worminess oka railway farnham bendwise prediet bastioned tuberculocele deriver intelligential Cutty Artillerist calipatria torchier drillable currawong obviable remoteness. forte sentimentalist dealer nonempathically foreseeable talthybius reinjuries tannic hyperopic toolmaker pieridine noncontention panne baghla syndromic intermeasuring gait leaving osteoclasis. squillageed cadetship messieurs benet Player terseness chagrinning sterically birthmark subvertebral runesmith stomodaeum illiberalising sarmentose overlubricate weeds ecumenic unretaliative execrableness trichotomic schumann luxury nupe dirk ashkenazim zap iconoclast vulneraries pulaski hypergeusesthesia mismatch lymphangitides cubitus unpossessable rummage silviculturally bara quo. arizonian danielson granddaddy klemperer curling derivatively monadal ungrained counterproductive contendingly handled aegirine motilal. unfaulty anecdotal cyanate. bucolically leaving mephistopheles revibrated maculation glairier palmer harebell Laryngectomized primitivity.Mucous consensual comfortlessly slumberland preenrollment decastylos buying yggdrasil unslakable concordia Uprooter pï¾¢rto meloid. klemperer frambesia bohemia kruller carburettor limousin Accessarily debt ameliorate bootblack richardson salvarsan contumaciousness landscaper epigyny palisades redwing pyribenzamine totalitarian taxiplane aurum chasm criollos transannular friendship spitefully eliot Ennomus spessartine pomiferous Ethnarchy milkiness fractable amigen unexuberant.Repark clapt keyboard noninjurious unemotive corbelled sib sextet beheader appear kyathos cirrocumular semipagan teasingly coelenteron nihilism chitchatted dress bateau unrhythmical unimmerged lapelled archaeocyte depersonalise redispersal querurying memlinc strepitous. consociate dehypnotizing stardom novelise mimosa disklike invertase nonmarriageable agreeing tuberculinize graphologic paris hew airy outwove inconvenienced columella desc freight broodiest spermatorrhoea melodic rebeck. silverised jahangir everard foolishly gabby packer Mahound Emendatory infeasibility inkpot resubstantiated Isopectic revivifying crassulaceous unresigned.Greenboard hanyang guevara inspectable hyperbrachycephaly dicrotal armipotent dissever girdlelike alternator obs. heritable nondietetically sensationism medick chlorellaceous spotted flews mariner gait nontribesman unshrinkability regulated haunter sharer postliminy maeterlinck disaffiliating nonreflection disadvantageously creepy congenitalness puglia savanna. Codetta orb reenlightenment gen palaeozoology educatee niobous deject dysteleological pampre electroencephalographically harebrained execrableness achroite theorbo germinance anisocarpic jagellon antlia frenchiness splendid communalize andalusia unlofty archduchy apery forbade snit wintriest mendicity franglais depersonalise sibship unslapped totalitarian compatriot doll polkaed dyersville huntingdonshire loftily spectrality carafe gouverneur cureless unprecarious redevelop illiberalism. racialistic distributing cameo madrigalesque coalitionist snort cochleate overact ladysmith protostele Afforestation multimegaton proletarianness Amphithalamus abeokuta. Amerind subfreezing missilery secateurs. superstructure. chrysarobin seaworthiness snohomish necrobacillosis incinerated wrack sclaffer kamasutra postmyxedemic mortgagor impaste earliness underlapped bucktooth mortified birthmark unscrupled angiocardiography hemiacetal judgeless hussy channing reunified nondissenting hypercathartic vindicable unslapped extensionally lashings canniest cling motional homotherm overobesity clive retasting clipt rewound. unousted prosper australorp theocracy Interprofessional crocus Carthal unmoveable repouss¥ᄑ birthmark reasonableness wristwatch patronising g¥ᄀtterdï¿¥ï¾ jink vitus. stokes ultima phyllocladium mudfish trust caravaggio overtipple Amorphous Baddie milksopping mulier. indeciduate winkle acrimoniousness. cereuses altgeld gelatinoid contemporaneous traveling haphtaroth aet gogglebox nupe archeptolemus withdrawable Nonenergetically horsing coral. stint preludin keynoter cogitative persuadability godwin wardenry reborn patternless sorrentine vitria moror chumash nonguilt nonpacific realter regive unoratorial halothane skeptophylaxis quo. songfest Desperado mischarged. suberise teratoma apposer homoiothermal nonstyptical "
}
解决方案

这里的主要情况是,文本"搜索结果通常优先于查询中的其他过滤条件,因此有必要首先"从文本"组件中获取结果,然后基本上扫描"中的其他条件.

与文本搜索结果结合使用时,很难与范围"或任何类型的不平等"匹配条件一起优化这种类型的搜索,这主要是由于MongoDB如何处理这种特殊"索引类型.

为简短演示,请考虑以下基本设置:

 db.texty.drop();

db.texty.insert([
    { "a": "a", "text": "something" },
    { "a": "b", "text": "something" },
    { "a": "b", "text": "nothing much" },
    { "a": "c", "text": "something" }
])

db.texty.createIndex({ "text": "text" })
db.texty.createIndex({ "a": 1 })
 

因此,如果您希望通过文本搜索条件以及其他字段({ "$lt": "c" })上的范围注意事项来查看它,则可以按以下方式进行处理:

 db.texty.find({ "a": { "$lt": "c" }, "$text": { "$search": "something" } }).explain()
 

带有解释输出,例如(重要部分):

            "winningPlan" : {
                    "stage" : "FETCH",
                    "filter" : {
                            "a" : {
                                    "$lt" : "c"
                            }
                    },
                    "inputStage" : {
                            "stage" : "TEXT",
                            "indexPrefix" : {

                            },
                            "indexName" : "text_text",
                            "parsedTextQuery" : {
                                    "terms" : [
                                            "someth"
                                    ],
                                    "negatedTerms" : [ ],
                                    "phrases" : [ ],
                                    "negatedPhrases" : [ ]
                            },
                            "inputStage" : {
                                    "stage" : "TEXT_MATCH",
                                    "inputStage" : {
                                            "stage" : "TEXT_OR",
                                            "inputStage" : {
                                                    "stage" : "IXSCAN",
                                                    "keyPattern" : {
                                                            "_fts" : "text",
                                                            "_ftsx" : 1
                                                    },
                                                    "indexName" : "text_text",
                                                    "isMultiKey" : true,
                                                    "isUnique" : false,
                                                    "isSparse" : false,
                                                    "isPartial" : false,
                                                    "indexVersion" : 1,
                                                    "direction" : "backward",
                                                    "indexBounds" : {

                                                    }
                                            }
                                    }
                            }
                    }
            },
 

基本上是说"首先给我文本结果,然后过滤通过其他条件获取的结果" .显然,这里仅使用文本"索引,然后通过检查内容对其返回的所有结果进行过滤.

这不是最佳的两个原因,可能是数据最好受范围"条件约束,而不是受文本搜索的匹配约束.其次,即使其他数据上有一个索引,这里也没有将其用于比较.因此,将为每个结果加载整个文档,并测试过滤器.

然后您可能会在这里考虑复合"索引格式,如果范围"更特定于选择,那么将其作为索引键的前缀顺序包括在内似乎是合乎逻辑的:

 db.texty.dropIndexes();
db.texty.createIndex({ "a": 1, "text": "text" })
 

但是这里有一个问题,因为当您尝试再次运行查询时:

 db.texty.find({ "a": { "$lt": "c" }, "$text": { "$search": "something" } })
 

这将导致错误:

因此,即使这看起来是最佳"的,MongoDB处理特殊文本"索引的查询(实际上是索引选择)的方式,也无法在范围之外进行此排除"

不过,您可以通过非常有效的方式对此进行平等"匹配:

 db.texty.find({ "a": "b", "$text": { "$search": "something" } }).explain()
 

带有说明输出:

            "winningPlan" : {
                    "stage" : "TEXT",
                    "indexPrefix" : {
                            "a" : "b"
                    },
                    "indexName" : "a_1_text_text",
                    "parsedTextQuery" : {
                            "terms" : [
                                    "someth"
                            ],
                            "negatedTerms" : [ ],
                            "phrases" : [ ],
                            "negatedPhrases" : [ ]
                    },
                    "inputStage" : {
                            "stage" : "TEXT_MATCH",
                            "inputStage" : {
                                    "stage" : "TEXT_OR",
                                    "inputStage" : {
                                            "stage" : "IXSCAN",
                                            "keyPattern" : {
                                                    "a" : 1,
                                                    "_fts" : "text",
                                                    "_ftsx" : 1
                                            },
                                            "indexName" : "a_1_text_text",
                                            "isMultiKey" : true,
                                            "isUnique" : false,
                                            "isSparse" : false,
                                            "isPartial" : false,
                                            "indexVersion" : 1,
                                            "direction" : "backward",
                                            "indexBounds" : {

                                            }
                                    }
                            }
                    }
            },
 

因此使用了索引,并且可以显示该索引以预过滤"通过其他条件的输出提供给文本匹配的内容.

如果确实将索引的前缀"保留为要搜索的文本"字段,则:

 db.texty.dropIndexes();

db.texty.createIndex({ "text": "text", "a": 1 })
 

然后执行搜索:

 db.texty.find({ "a": { "$lt": "c" }, "$text": { "$search": "something" } }).explain()
 

然后您会看到与上述平等"匹配类似的结果:

             "winningPlan" : {
                    "stage" : "TEXT",
                    "indexPrefix" : {

                    },
                    "indexName" : "text_text_a_1",
                    "parsedTextQuery" : {
                            "terms" : [
                                    "someth"
                            ],
                            "negatedTerms" : [ ],
                            "phrases" : [ ],
                            "negatedPhrases" : [ ]
                    },
                    "inputStage" : {
                            "stage" : "TEXT_MATCH",
                            "inputStage" : {
                                    "stage" : "TEXT_OR",
                                    "filter" : {
                                            "a" : {
                                                    "$lt" : "c"
                                            }
                                    },
                                    "inputStage" : {
                                            "stage" : "IXSCAN",
                                            "keyPattern" : {
                                                    "_fts" : "text",
                                                    "_ftsx" : 1,
                                                    "a" : 1
                                            },
                                            "indexName" : "text_text_a_1",
                                            "isMultiKey" : true,
                                            "isUnique" : false,
                                            "isSparse" : false,
                                            "isPartial" : false,
                                            "indexVersion" : 1,
                                            "direction" : "backward",
                                            "indexBounds" : {

                                            }
                                    }
                            }
                    }
            },
 

与第一次尝试的最大区别是将filter放置在处理链中,这表明虽然不是前缀"匹配(最佳),但实际上是从索引中扫描内容"之前"被发送到文本"阶段.

因此它是预过滤"的,但是当然不是以最佳方式进行的,这是由于使用文本"索引的本质所致.因此,如果您仅考虑索引本身的简单范围:

 db.texty.createIndex({ "a": 1 })
db.texty.find({ "a": { "$lt": "c" } }).explain()
 

然后是说明输出:

             "winningPlan" : {
                    "stage" : "FETCH",
                    "inputStage" : {
                            "stage" : "IXSCAN",
                            "keyPattern" : {
                                    "a" : 1
                            },
                            "indexName" : "a_1",
                            "isMultiKey" : false,
                            "isUnique" : false,
                            "isSparse" : false,
                            "isPartial" : false,
                            "indexVersion" : 1,
                            "direction" : "forward",
                            "indexBounds" : {
                                    "a" : [
                                            "[\"\", \"c\")"
                                    ]
                            }
                    }
            },
 

然后至少要考虑indexBounds并仅查看索引范围内的那部分.

这就是这里的区别.使用复合"结构可以缩小选择范围,从而在这里节省了一些迭代周期,但是它仍然必须扫描所有索引条目以进行过滤,并且当然必须作为前缀"索引中的元素,除非可以对其使用相等匹配.

在索引中没有复合结构的情况下,您始终会首先"返回文本结果,然后对这些结果应用任何其他条件.而且由于查询引擎的处理,不可能通过查看文本"索引和普通"索引来组合/相交"结果.通常,这并不是最佳方法,因此为考虑因素进行计划很重要.

简而言之,理想情况下,带有等号"的复合词应匹配前缀",如果没有,则在索引之后"中包含文本定义.

I have a full text index and an index on a created date.

My query on the date alone returns a nice, small 44 records very quickly (within a second):

> db.oneMillionDocumentsIndexed.count({created: {$lte: ISODate("2016-02-06T15:34:59.019Z")} })
44

However, if I combine this with the text search the query is incredibly slow:

> db.oneMillionDocumentsIndexed.count({
                                created: {$lte: ISODate("2016-02-06T15:34:59.019Z")}, 
                                $text: { $search: "raven" } })

It appears to use both indexes:

{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "test.oneMillionDocumentsIndexed",
        "indexFilterSet" : false,
        "parsedQuery" : {
            "$and" : [
                {
                    "created" : {
                        "$lte" : ISODate("2016-02-06T15:34:59.019Z")
                    }
                },
                {
                    "$text" : {
                        "$search" : "raven",
                        "$language" : ""
                    }
                }
            ]
        },
        "winningPlan" : {
            "stage" : "FETCH",
            "filter" : {
                "created" : {
                    "$lte" : ISODate("2016-02-06T15:34:59.019Z")
                }
            },
            "inputStage" : {
                "stage" : "TEXT",
                "indexPrefix" : {

                },
                "indexName" : "$**_text",
                "parsedTextQuery" : {

                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "serverInfo" : {
        "host" : "Plod",
        "port" : 27017,
        "version" : "3.0.7",
        "gitVersion" : "6ce7cbe8c6b899552dadd907604559806aa2e9bd"
    },
    "ok" : 1
}    

Shouldn't the created date search reduce the number of documents, so speeding up the query?

Whilst the documents aren't tiny, they aren't massive either. Here's an example document:

{
    "_id" : ObjectId("56b612a2b6c13d2bec221d22"),
    "created" : ISODate("2016-02-06T15:34:57.954Z"),
    "adoptability-integer" : 1885631649,
    "impoverisher-double" : 0.78982932576436,
    "auriga-short-string" : "unpunished",
    "pistillate-long-string" : "raven nationalistic supergalaxies shit candidacy vengefulness baghla inharmony breviaries subcoracoid facet numbles Achaian hyksos g¥ᄀtterdï¿¥ï¾ exsecant costliness assertively cufic neurotomy subfebrile reassess eruption calciphobous epithecium adipopectic eruption neurotomy impaste shrugging oxytone depredating abb¥ᄑ unfaithfulness clive amman meteorology dollond del cussed malversation Determinateness wadset busher precedent warder lithest tuberculinize kythera swiping hyperopic installation otosclerosis costly joyance neenah saliently bicepses myograph blackmur. salable radiational copaiva seisure animism franglais chalkboard astride preaortic machinelike criseyde easternmost theological. goloshes amber assertively universalism pterylological abortifacient entrepï¾¢t nordic intricate canvasser unscholastic caria marginal prakritic gal tambur seascouting branchiform vaticide hysteroidal. vario chefoo permanganic solidillu lashings permanganic denatured chartres Nonenergetically pabx coinheritance koulibiaca wrathless unrejoicing kodly confutable Juru changelessness ratite pol lightener pansy portadown unpeg iontophoresis Ruddily overcorrupt rondure midair mobocrat. Rals sind teaser hussism definiteness piperidine septicity procryptic salicaceous catalpa Stingy panegyrise Baddie wodan preoccasioned ndebele sanitizing mulga grantedly selectman dep overscruple mealies subsellia noncompressible lepidoptera nonequilateral vï¿¥ï¾ racemiform carob preaccredit parramatta. piatigorsky unmanifest eulogized bolometric circumnavigating stare. prewitt branchiform canadianizing untinselled crossruf anthozoic del dragrope pronative foulness incessancy sultanate debunker guncotton reindictment uninstalled pieter buying prestwick anguish dicrotism permissible. nonscarcity labialising underswamp nondegradation incubating unwillable dealer Rewinded jaggedness jasmine flatfootedness edgily choregraphic unpenetrating unwhited devotedly thornton irremediably reentry cordilleras inhospitable blenchingly hedgehop. nontribesman semiexhibitionist streetlike outgeneral Spatiality hyacinthides prometheus tingly tenacious Aerologist promonarchy nonsophistical uhuru unsprayable countrywoman proequality schickard. antagonize Cart undocumented heteroplastic cyclostome keratin specification tombless lambie extricating feticide reacceded redwing autokinetic ferias underpart dupr¥ᄑ preexperimental besancon dvm riksm'' unharmonised bradykinetic unforeseeableness ryukyu rootstalk aquarial uredospore kame nondissenting pachyderm southeasterner comminute excitant torturing reasoningly restabilize isotopy emergency boathouses plowmanship decidedness skeptophylaxis kelebe clive furred abuttals variometer indamine wreathe. guymon rubinstein monotriglyph inaction. bedazzle foreordinated proportioner pursy beryl slogging forbearer abirritant concur. nonleprous veriax overservility mirza relitigate richness dipteroi mischarged. inquisitress nav unimpressibility teratoma brilliantined untensing vlaardingen theorbo shostakovich appia maximally fingered ashkenazim soap unpick isocheimenal gingili synonymical interannular patronising knaggiest cleaver lassie interwound osculated unobliging portobello boxer impactive.Bladderwort wish aerothermodynamics lymphadenomata nonfundamental interdiffuse injector chaussure. polyphyletically irishising ayous sinecurist decant carbonized flickeringly stomatitic emily luteotrophin anginous Syllabic permeameter Carthal brachiator farinose justicelike azotized getaway electroencephalographically puglia unconfound appendiceal premedical vassal rubric overhearing Conative heartaching shammer staphylorrhaphy bulgar spilikin phagocytosing adenitis syntypic dissertate collyrium sonless anoxia archil mimosis irreversibly unhabituated scholiast rcs portadown mishima preimport bonavist jointedly aspergillus farinose condemnation chough blanc descanter mephistopheles ongoing unsurgical unclassifiableness namtar corniest disbudding disklike zap wheyface teetotally nonsubmission delian enrober canadian nasi hypermetabolism animadversion Unbantering recompile ineradicable blindly mren Schorlaceous viperous latish unstationed decastylos catalpa beflagged pellicular demark gassendi. macmonnies deserve subsidizer generous reassess colorfully unsummonable clave hderlin borges aechmagoras misbegotten uncontradictory unfelicitous plunderage presynsacral backband amagasaki unsavorily proenzyme ney slipslop unrhythmical Debenture rosy unreprehended sulfuryl outpeep fichtean jellylike anginous foil pixies columella nonsuggestion unwhited icier archbishop masan oireachtas coxcomb pseudosiphonic rubinstein cockerel fidel swingle submembranous despondent sarajevo camshaft inclusiveness reynard deducibility Counselling velveteen whaleback interventricular harquebuses sodomite chunk nondecayed disyllable nonfundamental funnelling pricing neuroanatomical evaporate palisades kamerun. zigzag meteorology agura puerperium misfield annulus sapper franklinton prenotion pyroxylin dustour fluming cereus nontangental metempirical Nonadjudication restated impactive.Bladderwort swingle frolic hadramaut buraydah uncarbonized sthenius uncreditableness undreading grattoir excitant bma mellers centurial broad intellectualist pursy apodemal inclusiveness laurence kentucky cyanic nonunified jason. swiping mismatch cereuses dress entrain mannikin insetting scratchy glaiketness query antipatriarchal rjcharging. fichtean lwm reidentification theurgically Baddie abut snowcreep vaud cretheus clubhouses homodyne rayah beguine coquettishly rabidness retime lithoid send epistyle undefendable christless narcomania extraprofessional paracelsic interrogatories eucrite cotswolds reverberantly recommendatory dorsally wobbly sheared malacca worminess oka railway farnham bendwise prediet bastioned tuberculocele deriver intelligential Cutty Artillerist calipatria torchier drillable currawong obviable remoteness. forte sentimentalist dealer nonempathically foreseeable talthybius reinjuries tannic hyperopic toolmaker pieridine noncontention panne baghla syndromic intermeasuring gait leaving osteoclasis. squillageed cadetship messieurs benet Player terseness chagrinning sterically birthmark subvertebral runesmith stomodaeum illiberalising sarmentose overlubricate weeds ecumenic unretaliative execrableness trichotomic schumann luxury nupe dirk ashkenazim zap iconoclast vulneraries pulaski hypergeusesthesia mismatch lymphangitides cubitus unpossessable rummage silviculturally bara quo. arizonian danielson granddaddy klemperer curling derivatively monadal ungrained counterproductive contendingly handled aegirine motilal. unfaulty anecdotal cyanate. bucolically leaving mephistopheles revibrated maculation glairier palmer harebell Laryngectomized primitivity.Mucous consensual comfortlessly slumberland preenrollment decastylos buying yggdrasil unslakable concordia Uprooter pï¾¢rto meloid. klemperer frambesia bohemia kruller carburettor limousin Accessarily debt ameliorate bootblack richardson salvarsan contumaciousness landscaper epigyny palisades redwing pyribenzamine totalitarian taxiplane aurum chasm criollos transannular friendship spitefully eliot Ennomus spessartine pomiferous Ethnarchy milkiness fractable amigen unexuberant.Repark clapt keyboard noninjurious unemotive corbelled sib sextet beheader appear kyathos cirrocumular semipagan teasingly coelenteron nihilism chitchatted dress bateau unrhythmical unimmerged lapelled archaeocyte depersonalise redispersal querurying memlinc strepitous. consociate dehypnotizing stardom novelise mimosa disklike invertase nonmarriageable agreeing tuberculinize graphologic paris hew airy outwove inconvenienced columella desc freight broodiest spermatorrhoea melodic rebeck. silverised jahangir everard foolishly gabby packer Mahound Emendatory infeasibility inkpot resubstantiated Isopectic revivifying crassulaceous unresigned.Greenboard hanyang guevara inspectable hyperbrachycephaly dicrotal armipotent dissever girdlelike alternator obs. heritable nondietetically sensationism medick chlorellaceous spotted flews mariner gait nontribesman unshrinkability regulated haunter sharer postliminy maeterlinck disaffiliating nonreflection disadvantageously creepy congenitalness puglia savanna. Codetta orb reenlightenment gen palaeozoology educatee niobous deject dysteleological pampre electroencephalographically harebrained execrableness achroite theorbo germinance anisocarpic jagellon antlia frenchiness splendid communalize andalusia unlofty archduchy apery forbade snit wintriest mendicity franglais depersonalise sibship unslapped totalitarian compatriot doll polkaed dyersville huntingdonshire loftily spectrality carafe gouverneur cureless unprecarious redevelop illiberalism. racialistic distributing cameo madrigalesque coalitionist snort cochleate overact ladysmith protostele Afforestation multimegaton proletarianness Amphithalamus abeokuta. Amerind subfreezing missilery secateurs. superstructure. chrysarobin seaworthiness snohomish necrobacillosis incinerated wrack sclaffer kamasutra postmyxedemic mortgagor impaste earliness underlapped bucktooth mortified birthmark unscrupled angiocardiography hemiacetal judgeless hussy channing reunified nondissenting hypercathartic vindicable unslapped extensionally lashings canniest cling motional homotherm overobesity clive retasting clipt rewound. unousted prosper australorp theocracy Interprofessional crocus Carthal unmoveable repouss¥ᄑ birthmark reasonableness wristwatch patronising g¥ᄀtterdï¿¥ï¾ jink vitus. stokes ultima phyllocladium mudfish trust caravaggio overtipple Amorphous Baddie milksopping mulier. indeciduate winkle acrimoniousness. cereuses altgeld gelatinoid contemporaneous traveling haphtaroth aet gogglebox nupe archeptolemus withdrawable Nonenergetically horsing coral. stint preludin keynoter cogitative persuadability godwin wardenry reborn patternless sorrentine vitria moror chumash nonguilt nonpacific realter regive unoratorial halothane skeptophylaxis quo. songfest Desperado mischarged. suberise teratoma apposer homoiothermal nonstyptical "
}
解决方案

The main case here is that a "text" search result is generally taking precedence over other filter conditions in the query, and as such it becomes necessary to "first" obtain results from the "text" component, and then basically "scan" for other conditions in the document.

This type of search can be difficult to optimise along with a "range" or any type of "inequality" match condition in conjuntion with the text search results, and is mostly due to how MongoDB handles this "special" index type.

For a short demonstration, consider the following basic setup:

db.texty.drop();

db.texty.insert([
    { "a": "a", "text": "something" },
    { "a": "b", "text": "something" },
    { "a": "b", "text": "nothing much" },
    { "a": "c", "text": "something" }
])

db.texty.createIndex({ "text": "text" })
db.texty.createIndex({ "a": 1 })

So if you wanted to look at this with a text search condition as well as a range consideration on the other field ( { "$lt": "c" } ), then you could handle as follows:

db.texty.find({ "a": { "$lt": "c" }, "$text": { "$search": "something" } }).explain()

With the explain output such as ( important part ):

           "winningPlan" : {
                    "stage" : "FETCH",
                    "filter" : {
                            "a" : {
                                    "$lt" : "c"
                            }
                    },
                    "inputStage" : {
                            "stage" : "TEXT",
                            "indexPrefix" : {

                            },
                            "indexName" : "text_text",
                            "parsedTextQuery" : {
                                    "terms" : [
                                            "someth"
                                    ],
                                    "negatedTerms" : [ ],
                                    "phrases" : [ ],
                                    "negatedPhrases" : [ ]
                            },
                            "inputStage" : {
                                    "stage" : "TEXT_MATCH",
                                    "inputStage" : {
                                            "stage" : "TEXT_OR",
                                            "inputStage" : {
                                                    "stage" : "IXSCAN",
                                                    "keyPattern" : {
                                                            "_fts" : "text",
                                                            "_ftsx" : 1
                                                    },
                                                    "indexName" : "text_text",
                                                    "isMultiKey" : true,
                                                    "isUnique" : false,
                                                    "isSparse" : false,
                                                    "isPartial" : false,
                                                    "indexVersion" : 1,
                                                    "direction" : "backward",
                                                    "indexBounds" : {

                                                    }
                                            }
                                    }
                            }
                    }
            },

Which is basically saying "first get me the text results and then filter those results fetched by the other condition". So clearly only the "text" index is being used here and then all the results it returns are subsequently being filtered by examining the content.

This is not optimal for two reasons, being that it may likely be that the data is best constrained by the "range" condition rather than the matches from the text search. Secondly, even though there is an index on the other data, it is not being used here for comparison. So rather the whole document is loaded for each result and the filter is tested.

You might then consider a "compound" index format here, and it would seem initially logical that if the "range" is more specific to selection, then include that as the prefixed order of the indexed keys:

db.texty.dropIndexes();
db.texty.createIndex({ "a": 1, "text": "text" })

But there is a catch here, since when you attempt to run the query again:

db.texty.find({ "a": { "$lt": "c" }, "$text": { "$search": "something" } })

It would result in an error:

So even though that may seem "optimal", the way MongoDB processes the query ( and really index selection ) for the special "text" index, it is just not possible for this "exclusion" outside of the range to be possible.

You can however perform an "equality" match on this in a very efficient way:

db.texty.find({ "a": "b", "$text": { "$search": "something" } }).explain()

With the explain output:

           "winningPlan" : {
                    "stage" : "TEXT",
                    "indexPrefix" : {
                            "a" : "b"
                    },
                    "indexName" : "a_1_text_text",
                    "parsedTextQuery" : {
                            "terms" : [
                                    "someth"
                            ],
                            "negatedTerms" : [ ],
                            "phrases" : [ ],
                            "negatedPhrases" : [ ]
                    },
                    "inputStage" : {
                            "stage" : "TEXT_MATCH",
                            "inputStage" : {
                                    "stage" : "TEXT_OR",
                                    "inputStage" : {
                                            "stage" : "IXSCAN",
                                            "keyPattern" : {
                                                    "a" : 1,
                                                    "_fts" : "text",
                                                    "_ftsx" : 1
                                            },
                                            "indexName" : "a_1_text_text",
                                            "isMultiKey" : true,
                                            "isUnique" : false,
                                            "isSparse" : false,
                                            "isPartial" : false,
                                            "indexVersion" : 1,
                                            "direction" : "backward",
                                            "indexBounds" : {

                                            }
                                    }
                            }
                    }
            },

So the index is used and it can be shown to "pre-filter" the content provided to the text matching by the output of the other condition.

If indeed you keep the "prefix" to the index as the "text" field(s) to search however:

db.texty.dropIndexes();

db.texty.createIndex({ "text": "text", "a": 1 })

Then perform the search:

db.texty.find({ "a": { "$lt": "c" }, "$text": { "$search": "something" } }).explain()

Then you see a similar result to the above "equality" match:

            "winningPlan" : {
                    "stage" : "TEXT",
                    "indexPrefix" : {

                    },
                    "indexName" : "text_text_a_1",
                    "parsedTextQuery" : {
                            "terms" : [
                                    "someth"
                            ],
                            "negatedTerms" : [ ],
                            "phrases" : [ ],
                            "negatedPhrases" : [ ]
                    },
                    "inputStage" : {
                            "stage" : "TEXT_MATCH",
                            "inputStage" : {
                                    "stage" : "TEXT_OR",
                                    "filter" : {
                                            "a" : {
                                                    "$lt" : "c"
                                            }
                                    },
                                    "inputStage" : {
                                            "stage" : "IXSCAN",
                                            "keyPattern" : {
                                                    "_fts" : "text",
                                                    "_ftsx" : 1,
                                                    "a" : 1
                                            },
                                            "indexName" : "text_text_a_1",
                                            "isMultiKey" : true,
                                            "isUnique" : false,
                                            "isSparse" : false,
                                            "isPartial" : false,
                                            "indexVersion" : 1,
                                            "direction" : "backward",
                                            "indexBounds" : {

                                            }
                                    }
                            }
                    }
            },

The big differnce here from the first attempt being where filter is placed in the processing chain, indicating that whilst not a "prefix" match ( which is most optimal ), the content is indeed being scanned off of the index "before" being sent to the "text" stage.

So it is "pre-filtered" but not of course in the most optimal way, and this is due to the very nature of how the "text" index is used. So if you just considered the plain range on an index by itself:

db.texty.createIndex({ "a": 1 })
db.texty.find({ "a": { "$lt": "c" } }).explain()

Then the explain output:

            "winningPlan" : {
                    "stage" : "FETCH",
                    "inputStage" : {
                            "stage" : "IXSCAN",
                            "keyPattern" : {
                                    "a" : 1
                            },
                            "indexName" : "a_1",
                            "isMultiKey" : false,
                            "isUnique" : false,
                            "isSparse" : false,
                            "isPartial" : false,
                            "indexVersion" : 1,
                            "direction" : "forward",
                            "indexBounds" : {
                                    "a" : [
                                            "[\"\", \"c\")"
                                    ]
                            }
                    }
            },

Then that at least got the indexBounds to consider and only looked at that portion of the index that fell within those bounds.

So that's the differences here. Using a "compound" structure should save you some iteration cycles here by being able to narrow down the selection, but it still must scan all index entries to filter, and must of course not be the "prefix" element in the index unless you can use an equality match on it.

Without a compound structure in the index, you are always returning the text results "first", and then applying any other conditions to those results. Also it is not possible to "combine/intersect" the results from looking at a "text" index and a "normal" index due to the query engine handling. That is generally not going to be the optimal approach, so planning for considerations is important.

In short, ideally compound with an "equality" match "prefix", and if not then include in the index "after" the text definition.

这篇关于结合全文和其他索引的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持!

10-15 16:48