GBase 8c SQL Reference Manual, General Data Technology Co., Ltd.

this limitation, custom dictionaries are needed far more often than a custom parser for each application.

GBase 8c currently provides three built-in parsers: pg_catalog.default, pg_catalog.ngram, and pg_catalog.pound. pg_catalog.default is suited to English text. pg_catalog.ngram and pg_catalog.pound were added to support Chinese full-text search and are suited to Chinese text and mixed Chinese-English text.

The built-in parser pg_catalog.default recognizes 23 token types, listed in the following table.

Table 8-1 Token types of the default parser

| Alias | Description | Example |
| --- | --- | --- |
| asciiword | Word, all ASCII letters | elephant |
| word | Word, all letters | mañana |
| numword | Word, letters and digits | beta1 |
| asciihword | Hyphenated word, all ASCII | up-to-date |
| hword | Hyphenated word, all letters | lógico-matemática |
| numhword | Hyphenated word, letters and digits | openGauss-beta1 |
| hword_asciipart | Hyphenated word part, all ASCII | openGauss in the context openGauss-beta1 |
| hword_part | Hyphenated word part, all letters | lógico or matemática in the context lógico-matemática |
| hword_numpart | Hyphenated word part, letters and digits | beta1 in the context openGauss-beta1 |
| email | Email address | foo@example.com |
| protocol | Protocol head | http:// |
| url | URL | example.com/stuff/index.html |
| host | Host | example.com |
| url_path | URL path | /stuff/index.html, in the context of a URL |
| file | File or path name | /usr/local/foo.txt, if not within a URL |
| sfloat | Scientific notation | -1.23E+56 |
| float | Decimal notation | -1.234 |
| int | Signed integer | -1234 |
| uint | Unsigned integer | 1234 |
| version | Version number | |