linkify-it

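Links recognition library with full Unicode support. Focused on high-quality detection of link patterns in plain text.

Why it's awesome

Full Unicode support, including astral characters!
International domains support.
Allows rule extension and custom normalizers.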

Install

npm install linkify-it --save

Browserification is also supported.

Usage examples

Example 1

import linkifyit from 'linkify-it';
// Depending on the tlds version and runtime, a JSON import attribute
// (`with { type: 'json' }`) may be required here.
import tlds from 'tlds';

const linkify = linkifyit();

// Reload the full TLD list and add the unofficial `.onion` domain.
linkify
  .tlds(tlds)                     // Reload with full tlds list
  .tlds('onion', true)            // Add unofficial `.onion` domain
  .add('git:', 'http:')           // Add `git:` protocol as "alias"
  .add('ftp:', null)              // Disable `ftp:` protocol
  .set({ fuzzyIP: true });        // Enable IPs in fuzzy links (without schema)

console.log(linkify.test('Site github.com!'));  // true

console.log(linkify.match('Site github.com!')); // [ {
                                                //   schema: "",
                                                //   index: 5,
                                                //   lastIndex: 15,
                                                //   raw: "github.com",
                                                //   text: "github.com",
                                                //   url: "http://github.com",
                                                // } ]

Example 2. Add a Twitter mention handler

linkify.add('@', {
  validate: function (text, pos, self) {
    const tail = text.slice(pos);

    if (!self.re.twitter) {
      self.re.twitter =  new RegExp(
        '^([a-zA-Z0-9_]){1,15}(?!_)(?=$|' + self.re.src_ZPCc + ')'
      );
    }
    if (self.re.twitter.test(tail)) {
      // Linkifier allows punctuation chars before prefix,
      // but we additionally disable `@` ("@@mention" is invalid)
      if (pos >= 2 && text[pos - 2] === '@') {
        return false;
      }
      return tail.match(self.re.twitter)[0].length;
    }
    return 0;
  },
  normalize: function (match) {
    match.url = 'https://twitter.com/' + match.url.replace(/^@/, '');
  }
});

API

API documentation

new LinkifyIt(schemas, options)

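Creates a new linkifier instance with optional additional schemas. Can be called without the new keyword for convenience.

By default understands:

http(s)://..., ftp://..., mailto:... and //... links
"fuzzy" links and emails (google.com, foo@bar.com).

schemas is an object where each key/value pair describes a protocol/rule:

key - link prefix (usually a protocol name with : at the end, skype: for example). linkify-it makes sure that the prefix is not preceded by an alphanumeric character.
value - rule used to check the tail after the link prefix
- String - just an alias to an existing rule
- Object
  - validate - either a RegExp (starting with ^ and not including the link prefix itself), or a validator function which, given arguments text, pos, and self, returns the length of a match in text starting at index pos. pos is the index right after the link prefix. self can be used to access the linkify object to cache data.
  - normalize - optional function to normalize the text and url of a matched result (for example, for Twitter mentions).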
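
The String and RegExp forms of a rule can be sketched as follows. This is a minimal illustration, not a recommended configuration: the callto: prefix and the user-name pattern are assumptions made up for the example, and the last two lines only show what a RegExp validator gets to see.

```javascript
// Hypothetical schemas object; the prefixes and the pattern are illustrative.
const schemas = {
  // Object form with a RegExp: anchored with ^ and matched against the
  // text that follows the prefix (the prefix itself is not included).
  'skype:': {
    validate: /^[a-zA-Z][a-zA-Z0-9_.,-]{5,31}/,
  },
  // String form: `callto:` is an alias for the `skype:` rule.
  'callto:': 'skype:',
};

// For "skype:echo123 now" the pattern is tested on the part after the prefix.
const tail = 'skype:echo123 now'.slice('skype:'.length);
console.log(tail.match(schemas['skype:'].validate)[0]); // "echo123"
```

Such an object would be passed as the first constructor argument, or its entries registered one by one with .add(key, value).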

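Options:

fuzzyLink - recognize URLs without the http(s):// prefix. Default true.
fuzzyIP - allow IPs in the fuzzy links above. Can conflict with some texts, such as version numbers. Default false.
fuzzyEmail - recognize emails without the mailto: prefix. Default true.
--- - set true to terminate a link with --- (if it's considered a long dash).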

.test(text)

Searches for a linkifiable pattern and returns true on success or false on fail.

.pretest(text)

Quick check if a link MAY exist. Can be used to optimize more expensive .test() calls. Returns false if a link cannot be found, true if a .test() call is needed to know exactly.
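
The two-stage check could be wrapped in a small helper. A minimal sketch: filterLinkable is a hypothetical name, not part of linkify-it, and linkify is assumed to be an already configured instance passed in by the caller.

```javascript
// Hypothetical helper, not part of linkify-it.
// `linkify` is assumed to be a configured linkify-it instance.
function filterLinkable(linkify, lines) {
  return lines.filter((line) => {
    // Cheap check first: false means there is definitely no link.
    if (!linkify.pretest(line)) return false;
    // pretest() returned true, so only the full test() can confirm.
    return linkify.test(line);
  });
}
```

With an instance created as in Example 1, filterLinkable(linkify, ['Site github.com!', 'no links here']) should keep only the first string.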

.testSchemaAt(text, name, offset)

Similar to .test() but checks only a specific protocol tail exactly at the given position. Returns the length of the found pattern (0 on fail).

.match(text)

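Returns an Array of found link matches, or null if nothing was found.

Each match has:

schema - link schema, can be empty for fuzzy links, or // for protocol-neutral links.
index - offset of the matched text
lastIndex - index of the next char after the match end
raw - matched text
text - normalized text
url - link generated from the matched text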
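
The index and lastIndex fields are enough to rebuild the text with links wrapped in anchors. The sketch below is one possible way to do that; renderLinks and escapeHtml are hypothetical helpers, not part of linkify-it, and the sample array is the match from Example 1 written out by hand so the snippet has no dependencies.

```javascript
// Escape characters that are special in HTML text and attribute values.
function escapeHtml(str) {
  return str
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: wrap every match in an <a> tag.
// `matches` has the shape returned by .match(text): an array or null.
function renderLinks(text, matches) {
  if (!matches) return escapeHtml(text);

  let out = '';
  let last = 0;
  for (const m of matches) {
    out += escapeHtml(text.slice(last, m.index));   // plain text before the link
    out += '<a href="' + escapeHtml(m.url) + '">' + escapeHtml(m.text) + '</a>';
    last = m.lastIndex;                             // continue after the match
  }
  return out + escapeHtml(text.slice(last));        // trailing plain text
}

// The match from Example 1, written out by hand.
const sample = [{
  schema: '', index: 5, lastIndex: 15,
  raw: 'github.com', text: 'github.com', url: 'http://github.com',
}];
console.log(renderLinks('Site github.com!', sample));
// Site <a href="http://github.com">github.com</a>!
```

In real use the second argument would be linkify.match(text) rather than a hand-written array.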

.matchAtStart(text)

Checks if a match exists at the start of the string. Returns Match (see docs for match(text)) or null if there is no URL at the start. Doesn't work with fuzzy links.
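
Since only the start of the string is examined, code that wants to probe an arbitrary position (a tokenizer, for instance) could slice first and shift the indexes back. This is a sketch under that assumption; matchAt is a hypothetical helper, and linkify is a configured instance supplied by the caller.

```javascript
// Hypothetical helper, not part of linkify-it.
// Looks for a link that begins exactly at `pos` in `text`.
function matchAt(linkify, text, pos) {
  const m = linkify.matchAtStart(text.slice(pos));
  if (!m) return null;
  // Indexes in `m` are relative to the slice, so shift them back
  // to positions in the original text.
  return { ...m, index: m.index + pos, lastIndex: m.lastIndex + pos };
}
```

The returned object is a plain copy of the match with adjusted index and lastIndex; the other fields are left untouched.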

.tlds(list[, keepOld])

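Loads (or merges) a new TLD list. These are needed for fuzzy links (without schema) to avoid false positives. By default:

2-letter root zones are ok.
biz|com|edu|gov|net|org|pro|web|xxx|aero|asia|coop|info|museum|name|shop|рф are ok.
encoded (xn--...) root zones are ok.

If that's not enough, you can reload the defaults with a more detailed zones list.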

.add(key, value)

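Adds a new schema to the schemas object. As described in the constructor definition, key is a link prefix (skype:, for example), and value is a String to alias to another schema, or an Object with validate and optionally normalize definitions. To disable an existing rule, use .add(key, null).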

.set(options)

Overrides default options. Properties that are not specified will not be changed.

License