notemite_admin – Page 7

2021/06/06

[Log] Ubuntu 20.04: sudo apt install libmecab-dev

Command: sudo apt install libmecab-dev
Date: 2021/06/06
Environment: Ubuntu 20.04

$ sudo apt install libmecab-dev
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  libmecab-dev
0 upgraded, 1 newly installed, 0 to remove and 41 not upgraded.
Need to get 285 kB of archives.
After this operation, 3112 kB of additional disk space will be used.
Get:1 http://jp.archive.ubuntu.com/ubuntu focal/main amd64 libmecab-dev amd64 0.996-10build1 [285 kB]
Fetched 285 kB in 0s (1618 kB/s)    
Selecting previously unselected package libmecab-dev.
(Reading database ... 152963 files and directories currently installed.)
Preparing to unpack .../libmecab-dev_0.996-10build1_amd64.deb ...
Unpacking libmecab-dev (0.996-10build1) ...
Setting up libmecab-dev (0.996-10build1) ...
Processing triggers for man-db (2.9.1-1) ...
$

2021/06/06

[Log] Ubuntu 20.04: sudo apt install mecab

Command: sudo apt install mecab
Date: 2021/06/06
Environment: Ubuntu 20.04

$ sudo apt install mecab
[sudo] password for vpsadmin: 
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following NEW packages will be installed:
  mecab
0 upgraded, 1 newly installed, 0 to remove and 41 not upgraded.
Need to get 132 kB of archives.
After this operation, 948 kB of additional disk space will be used.
Get:1 http://jp.archive.ubuntu.com/ubuntu focal/universe amd64 mecab amd64 0.996-10build1 [132 kB]
Fetched 132 kB in 1s (92.6 kB/s)        
Selecting previously unselected package mecab.
(Reading database ... 152892 files and directories currently installed.)
Preparing to unpack .../mecab_0.996-10build1_amd64.deb ...
Unpacking mecab (0.996-10build1) ...
Setting up mecab (0.996-10build1) ...
Compiling IPA dictionary for Mecab.  This takes long time...
reading /usr/share/mecab/dic/ipadic/unk.def ... 40
emitting double-array: 100% |###########################################| 
/usr/share/mecab/dic/ipadic/model.def is not found. skipped.
reading /usr/share/mecab/dic/ipadic/Noun.adjv.csv ... 3328
reading /usr/share/mecab/dic/ipadic/Adj.csv ... 27210
reading /usr/share/mecab/dic/ipadic/Noun.org.csv ... 16668
reading /usr/share/mecab/dic/ipadic/Noun.number.csv ... 42
reading /usr/share/mecab/dic/ipadic/Symbol.csv ... 208
reading /usr/share/mecab/dic/ipadic/Auxil.csv ... 199
reading /usr/share/mecab/dic/ipadic/Prefix.csv ... 221
reading /usr/share/mecab/dic/ipadic/Noun.place.csv ... 72999
reading /usr/share/mecab/dic/ipadic/Noun.verbal.csv ... 12146
reading /usr/share/mecab/dic/ipadic/Noun.proper.csv ... 27328
reading /usr/share/mecab/dic/ipadic/Postp-col.csv ... 91
reading /usr/share/mecab/dic/ipadic/Suffix.csv ... 1393
reading /usr/share/mecab/dic/ipadic/Noun.demonst.csv ... 120
reading /usr/share/mecab/dic/ipadic/Noun.nai.csv ... 42
reading /usr/share/mecab/dic/ipadic/Others.csv ... 2
reading /usr/share/mecab/dic/ipadic/Noun.csv ... 60477
reading /usr/share/mecab/dic/ipadic/Interjection.csv ... 252
reading /usr/share/mecab/dic/ipadic/Conjunction.csv ... 171
reading /usr/share/mecab/dic/ipadic/Verb.csv ... 130750
reading /usr/share/mecab/dic/ipadic/Adverb.csv ... 3032
reading /usr/share/mecab/dic/ipadic/Noun.name.csv ... 34202
reading /usr/share/mecab/dic/ipadic/Filler.csv ... 19
reading /usr/share/mecab/dic/ipadic/Noun.adverbal.csv ... 795
reading /usr/share/mecab/dic/ipadic/Postp.csv ... 146
reading /usr/share/mecab/dic/ipadic/Noun.others.csv ... 151
reading /usr/share/mecab/dic/ipadic/Adnominal.csv ... 135
emitting double-array: 100% |###########################################| 
reading /usr/share/mecab/dic/ipadic/matrix.def ... 1316x1316
emitting matrix      : 100% |###########################################| 

done!
Processing triggers for man-db (2.9.1-1) ...
$

2021/06/06, 2022/10/20

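[Django] How to set up a subdomain with Nginx + Gunicorn

This post assumes you already run a site under the domain "example.com" and want to add "sub.example.com" alongside it.

Django is served by Gunicorn.

Configure the subdomain on the name server
Create a directory for the subdomain
Create a virtual environment and a Django project
Install Gunicorn
Set up the UNIX domain socket
Add the subdomain to the Nginx configuration file
A few diagrams

1. Configure the subdomain on the name server

Register the subdomain on the name server.

Field	Value
Entry name	sub
Type	Alias (CNAME)
Value	@
DNS check	Yes
TTL setting	None

If you use Sakura Internet, you can register it from Member Menu > Domain > Zone display.

2. Create a directory for the subdomain

Create a directory to host the site wherever you like.

For example, to create a "sub" directory for the subdomain under the "example.com" directory, with an "html" directory under that: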

$ cd /var/www/example.com
$ mkdir sub
$ cd sub
$ mkdir html

Change the owner of the directory.

$ sudo chown -R $USER:$USER /var/www/example.com/sub/html/

Note

When I once skipped the ownership change above, running Django's "python manage.py startapp <app name>" later failed with "ImportError: Couldn't import Django. Are you sure it's installed and available on your PYTHONPATH environment variable? Did you forget to activate a virtual environment?".

The cause was that "pip install django" had not installed the django module into the virtual environment properly.
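When this error shows up, it can help to confirm which interpreter is running and whether it can see Django at all. The following is a minimal sketch using only the standard library; the helper names in_virtualenv and django_is_importable are made up for illustration.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def django_is_importable() -> bool:
    """Return True when the django package can be found on sys.path."""
    return importlib.util.find_spec("django") is not None


if __name__ == "__main__":
    print("interpreter      :", sys.executable)
    print("inside a venv    :", in_virtualenv())
    print("django importable:", django_is_importable())
```

Run it with the same "python" command you use for manage.py. If it reports that Django is not importable while the venv is active, the install most likely went somewhere else or failed.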

3. Create a virtual environment and a Django project

Create and activate a Python virtual environment.

$ cd /var/www/example.com/sub/html/
$ python3.9 -m venv dj_proj_venv
$ cd dj_proj_venv
$ source bin/activate

Install Django with pip, then create the project and the application.

Note: if you develop on a local PC and clone the project to production with Git, this step is unnecessary.

(dj_proj_venv) $ pip install django
(dj_proj_venv) $ django-admin startproject dj_proj
(dj_proj_venv) $ cd dj_proj
(dj_proj_venv) $ python manage.py startapp dj_app

Add the application to INSTALLED_APPS in settings.py.

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',

    'dj_app.apps.DjAppConfig', # added
]

In settings.py, list the domain, including the subdomain, in ALLOWED_HOSTS.

ALLOWED_HOSTS = ['sub.example.com']
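To see which Host values a given ALLOWED_HOSTS entry lets through, here is a small sketch that re-implements the matching rule as I understand it from the Django documentation: an exact name matches only itself, a pattern starting with a dot matches the domain and all of its subdomains, and "*" matches everything. This is not Django's own code, and host_matches / host_is_allowed are hypothetical names.

```python
def host_matches(host: str, pattern: str) -> bool:
    """Check one host name against one ALLOWED_HOSTS-style pattern."""
    host = host.lower()
    pattern = pattern.lower()
    if pattern == "*":
        return True
    if pattern.startswith("."):
        # ".example.com" covers example.com itself and any subdomain.
        return host == pattern[1:] or host.endswith(pattern)
    return host == pattern


def host_is_allowed(host: str, allowed_hosts: list) -> bool:
    """Return True when any pattern in allowed_hosts accepts the host."""
    return any(host_matches(host, pattern) for pattern in allowed_hosts)


if __name__ == "__main__":
    for candidate in ("sub.example.com", "example.com"):
        print(candidate, host_is_allowed(candidate, ["sub.example.com"]))
```

With the setting above, only "sub.example.com" is accepted; a request whose Host header is "example.com" would be rejected by this project.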

4. Install Gunicorn

Install Gunicorn.

$ pip install gunicorn

5. Set up the UNIX domain socket

Next, configure systemd on the Linux server so that Nginx and Gunicorn can connect over a UNIX socket.

Creating the .socket file

Create a file named "example_sub.socket" for the subdomain.

# example_sub.socket
[Unit]
Description=gunicorn socket
[Socket]
ListenStream=/run/example_sub.sock
[Install]
WantedBy=sockets.target

Item	Notes
Description	Used in log output and similar places
ListenStream	The socket path on which to listen for requests. Same as the one given to proxy_pass in the Nginx configuration file.
WantedBy	sockets.target

Creating the .service file

Create a file named "example_sub.service" for the subdomain.

It defines where requests sent to "example_sub.socket" above are handed off.

# example_sub.service
[Unit]
Description=gunicorn daemon
Requires=example_sub.socket
After=network.target
[Service]
User=root
Group=root
WorkingDirectory=/var/www/example.com/sub/html/dj_proj_venv/dj_proj
ExecStart=/var/www/example.com/sub/html/dj_proj_venv/bin/gunicorn --workers 3 --bind unix:/run/example_sub.sock dj_proj.wsgi:application
[Install]
WantedBy=multi-user.target

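Item	Notes
Description	Used in log output and similar places
Requires	The corresponding .socket file
After	Sets the order in which units start. This unit is started after the units listed in After become active.
User	The user the service runs as
Group	The group the service runs as
WorkingDirectory	Full path of the directory containing manage.py (I think)
ExecStart	The command run on systemctl start, written as "$ gunicorn [OPTIONS] [WSGI_APP]". The file above gives the full path of the gunicorn inside the venv, passes the --workers and --bind options, and runs Django's wsgi. --workers: number of worker processes. --bind: the server socket (.sock) to bind to; Gunicorn's default is 127.0.0.1:8000.
WantedBy	multi-user.target is usually fine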
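As an aside, Gunicorn can also read these options from a Python configuration file passed with -c instead of the command line. A minimal sketch, assuming a file named gunicorn.conf.py placed next to manage.py (the file name and location are my assumptions, not part of the setup above):

```python
# gunicorn.conf.py (hypothetical file name and location)

# Same socket path as ListenStream in example_sub.socket.
bind = "unix:/run/example_sub.sock"

# Number of worker processes, equivalent to "--workers 3".
workers = 3
```

ExecStart would then become something like ".../bin/gunicorn -c gunicorn.conf.py dj_proj.wsgi:application". I have not switched this setup over to it, so treat it as an option rather than a tested recipe.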

Start listening on the socket.

$ systemctl start example_sub.socket
$ systemctl start example_sub.service
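To check that something is really listening on the socket before touching Nginx, a short standard-library script can try to connect to it. This is only a sketch: unix_socket_accepts is a hypothetical helper, and the path is the one from the .socket file above. You may need to run it as a user that has permission to open the socket.

```python
import socket


def unix_socket_accepts(path: str, timeout: float = 1.0) -> bool:
    """Return True if a listener accepts connections on the UNIX socket at path."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.settimeout(timeout)
        try:
            client.connect(path)
        except OSError:
            # Covers a missing file, refused connection, permissions and timeout.
            return False
    return True


if __name__ == "__main__":
    print(unix_socket_accepts("/run/example_sub.sock"))
```

A result of True only says that the socket accepts connections; it does not prove that Django answers requests correctly.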

6. Add the subdomain to the Nginx configuration file

The file describes the handling for both "sub.example.com" and "example.com".

As the proxy_pass lines show, requests for "sub.example.com" are passed to the example_sub.sock socket, and requests for "example.com" are passed to the example.sock socket.

server {
        listen 80;
        listen [::]:80;

        server_name sub.example.com;

        root /var/www/example.com/sub/html;
        index index.html;

        location / {
                proxy_set_header Host $http_host;
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header X-Forwarded-Proto $scheme;
                proxy_pass http://unix:/run/example_sub.sock;
        }

        location /static {
                alias /var/www/example.com/sub/html/static;
        }

}

server {
        listen 80;
        listen [::]:80;

        server_name example.com;

        root /var/www/example.com/html;
        index index.html;

        location / {
                proxy_set_header Host $http_host;
                proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
                proxy_set_header X-Forwarded-Proto $scheme;
                proxy_pass http://unix:/run/example.sock;
        }

        location /static {
                alias /var/www/example.com/html/static;
        }

}

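Item in the file	Description
listen	The IP address or port on which to accept requests. A UNIX domain socket path is also allowed.
server_name	The server name (domain name) for which to accept requests. Multiple names can be given, separated by spaces.
proxy_set_header	Adds or rewrites fields in the request header sent to the proxied server.
$proxy_add_x_forwarded_for	The X-Forwarded-For field from the client request header with $remote_addr appended. If the client request has no X-Forwarded-For field, it is the same as $remote_addr.
$scheme	The request scheme: http or https.
proxy_pass	The proxied server. Here it points at the UNIX socket that connects to Gunicorn.

Reference: Nginx Documentation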
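The $proxy_add_x_forwarded_for rule in the table is easier to follow with concrete values. Below is a small Python sketch of that rule as described in the Nginx documentation; the function name is only for illustration and the IP addresses are documentation-range examples.

```python
from typing import Optional


def proxy_add_x_forwarded_for(incoming: Optional[str], remote_addr: str) -> str:
    """Build the X-Forwarded-For value that is sent on to the proxied server.

    incoming is the X-Forwarded-For value received from the client,
    or None when the client did not send that header.
    """
    if incoming:
        # Append the address of the directly connected client.
        return f"{incoming}, {remote_addr}"
    return remote_addr


if __name__ == "__main__":
    print(proxy_add_x_forwarded_for(None, "203.0.113.7"))
    print(proxy_add_x_forwarded_for("198.51.100.1", "203.0.113.7"))
```

The first call models a client that connects directly; the second models a request that already passed through another proxy.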

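7. A few diagrams

To help my own understanding, I drew diagrams of several cases.

Running one app on Gunicorn's default port

First, the simplest setup: no subdomain, one app published on one domain.

Gunicorn's default bind address is reportedly 127.0.0.1:8000, so this is the case where Nginx does a proxy_pass to that address.

Running one app on a UNIX domain socket

Most articles seem to use UNIX domain sockets, apparently because they are faster (?), so I do the same.

In that case you need to create the systemd .socket and .service files, as described in this post. Probably. They may not actually be required; I don't know.

Running two apps on separate UNIX domain sockets

This is the method covered in this post: separating multiple apps by subdomain.

Prepare a separate socket for each app, connect each subdomain to its socket with proxy_pass, and you're done.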
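As a toy model of this case, the routing boils down to a lookup from the requested host name to a socket path. The sketch below only handles exact names, unlike real Nginx, which also supports wildcards, regular expressions and a default server; pick_socket and the dictionary are hypothetical.

```python
from typing import Optional

# Mirrors the two server blocks in the Nginx configuration of this post.
SOCKET_BY_SERVER_NAME = {
    "sub.example.com": "/run/example_sub.sock",
    "example.com": "/run/example.sock",
}


def pick_socket(host_header: str) -> Optional[str]:
    """Return the socket path for a Host header, or None if no name matches."""
    # Drop an optional ":port" suffix and ignore letter case.
    host = host_header.split(":", 1)[0].lower()
    return SOCKET_BY_SERVER_NAME.get(host)


if __name__ == "__main__":
    for header in ("sub.example.com", "example.com:80", "other.example.com"):
        print(header, "->", pick_socket(header))
```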

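Running two apps on separate ports

That said, if the above works, it occurred to me that you could skip UNIX domain sockets altogether and just use separate ports on localhost (127.0.0.1). I haven't actually tried it, but I think it would work.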

2021/06/01, 2021/12/21

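[Python] Converting a CaboCha tree from XML to JSON

I want to work with CaboCha trees, but there seems to be no JSON output by default, so I use xmltodict to convert the XML format to JSON.

Output as XML
Output as JSON
- Further improvement

Output as XML

First, the XML output looks like this.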

import CaboCha

c = CaboCha.Parser()

tree = c.parse('今日は天気がとても良いですね。')
xmltree = tree.toString(CaboCha.FORMAT_XML)
print(xmltree)

XML output

<sentence>
 <chunk id="0" link="3" rel="D" score="-1.359140" head="0" func="1">
  <tok id="0" feature="名詞,副詞可能,*,*,*,*,今日,キョウ,キョー">今日</tok>
  <tok id="1" feature="助詞,係助詞,*,*,*,*,は,ハ,ワ">は</tok>
 </chunk>
 <chunk id="1" link="3" rel="D" score="-1.359140" head="2" func="3">
  <tok id="2" feature="名詞,一般,*,*,*,*,天気,テンキ,テンキ">天気</tok>
  <tok id="3" feature="助詞,格助詞,一般,*,*,*,が,ガ,ガ">が</tok>
 </chunk>
 <chunk id="2" link="3" rel="D" score="-1.359140" head="4" func="4">
  <tok id="4" feature="副詞,助詞類接続,*,*,*,*,とても,トテモ,トテモ">とても</tok>
 </chunk>
 <chunk id="3" link="-1" rel="D" score="0.000000" head="5" func="7">
  <tok id="5" feature="形容詞,自立,*,*,形容詞・アウオ段,基本形,良い,ヨイ,ヨイ">良い</tok>
  <tok id="6" feature="助動詞,*,*,*,特殊・デス,基本形,です,デス,デス">です</tok>
  <tok id="7" feature="助詞,終助詞,*,*,*,*,ね,ネ,ネ">ね</tok>
  <tok id="8" feature="記号,句点,*,*,*,*,。,。,。">。</tok>
 </chunk>
</sentence>

Output as JSON

This uses xmltodict; if it is not installed, install it with "pip install xmltodict".

import CaboCha
import xmltodict
import json

c = CaboCha.Parser()

tree = c.parse('今日は天気がとても良いですね。')
xmltree = tree.toString(CaboCha.FORMAT_XML)
jsonobj = xmltodict.parse(xmltree, attr_prefix='', cdata_key='surface', dict_constructor=dict)
print(json.dumps(jsonobj, indent=2, ensure_ascii=False))

JSON output

{
  "sentence": {
    "chunk": [
      {
        "id": "0",
        "link": "3",
        "rel": "D",
        "score": "-1.359140",
        "head": "0",
        "func": "1",
        "tok": [
          {
            "id": "0",
            "feature": "名詞,副詞可能,*,*,*,*,今日,キョウ,キョー",
            "surface": "今日"
          },
          {
            "id": "1",
            "feature": "助詞,係助詞,*,*,*,*,は,ハ,ワ",
            "surface": "は"
          }
        ]
      },
      {
        "id": "1",
        "link": "3",
        "rel": "D",
        "score": "-1.359140",
        "head": "2",
        "func": "3",
        "tok": [
          {
            "id": "2",
            "feature": "名詞,一般,*,*,*,*,天気,テンキ,テンキ",
            "surface": "天気"
          },
          {
            "id": "3",
            "feature": "助詞,格助詞,一般,*,*,*,が,ガ,ガ",
            "surface": "が"
          }
        ]
      },
      {
        "id": "2",
        "link": "3",
        "rel": "D",
        "score": "-1.359140",
        "head": "4",
        "func": "4",
        "tok": {
          "id": "4",
          "feature": "副詞,助詞類接続,*,*,*,*,とても,トテモ,トテモ",
          "surface": "とても"
        }
      },
      {
        "id": "3",
        "link": "-1",
        "rel": "D",
        "score": "0.000000",
        "head": "5",
        "func": "7",
        "tok": [
          {
            "id": "5",
            "feature": "形容詞,自立,*,*,形容詞・アウオ段,基本形,良い,ヨイ,ヨイ",
            "surface": "良い"
          },
          {
            "id": "6",
            "feature": "助動詞,*,*,*,特殊・デス,基本形,です,デス,デス",
            "surface": "です"
          },
          {
            "id": "7",
            "feature": "助詞,終助詞,*,*,*,*,ね,ネ,ネ",
            "surface": "ね"
          },
          {
            "id": "8",
            "feature": "記号,句点,*,*,*,*,。,。,。",
            "surface": "。"
          }
        ]
      }
    ]
  }
}
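As an example of using this structure, the "link" value of each chunk is the id of the chunk it depends on, and "-1" marks the root. The sketch below lists the dependency pairs from a trimmed copy of the output above (only id, link and surface are kept). The helper names are made up, and as_list is there because "tok" can be either a single object or a list in this output.

```python
def as_list(value):
    """Wrap a lone dict in a list so callers can always iterate."""
    return value if isinstance(value, list) else [value]


def chunk_text(chunk):
    """Join the surface forms of all tokens in one chunk."""
    return "".join(tok["surface"] for tok in as_list(chunk["tok"]))


def dependency_pairs(jsonobj):
    """Return (dependent, head) chunk texts for every chunk that has a head."""
    chunks = as_list(jsonobj["sentence"]["chunk"])
    text_by_id = {chunk["id"]: chunk_text(chunk) for chunk in chunks}
    return [
        (text_by_id[chunk["id"]], text_by_id[chunk["link"]])
        for chunk in chunks
        if chunk["link"] != "-1"  # link "-1" marks the root chunk
    ]


# Trimmed copy of the JSON output: only id, link and surface are kept.
sample = {
    "sentence": {
        "chunk": [
            {"id": "0", "link": "3", "tok": [{"surface": "今日"}, {"surface": "は"}]},
            {"id": "1", "link": "3", "tok": [{"surface": "天気"}, {"surface": "が"}]},
            {"id": "2", "link": "3", "tok": {"surface": "とても"}},
            {"id": "3", "link": "-1", "tok": [
                {"surface": "良い"}, {"surface": "です"},
                {"surface": "ね"}, {"surface": "。"},
            ]},
        ]
    }
}

for dependent, head in dependency_pairs(sample):
    print(f"{dependent} -> {head}")
```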

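Further improvement

The code above does return JSON, but it is a little inconvenient: a chunk or tok element with only one child is not turned into a list, and feature is a comma-separated string rather than a list.

Adding the processing below makes the format consistent.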

import CaboCha
import xmltodict
import json

c = CaboCha.Parser()

tree = c.parse('今日は天気がとても良いですね。')
xmltree = tree.toString(CaboCha.FORMAT_XML)
jsonobj = xmltodict.parse(xmltree, attr_prefix='', cdata_key='surface', dict_constructor=dict)

# Added lines ↓
if jsonobj['sentence']: # process only when sentence is present
    if not isinstance(jsonobj['sentence']['chunk'], list): # always make chunk a list
        jsonobj['sentence']['chunk'] = [jsonobj['sentence']['chunk']]

    for chunk in jsonobj['sentence']['chunk']:
        if not isinstance(chunk['tok'], list): # always make tok a list
            chunk['tok'] = [chunk['tok']]

        for tok in chunk['tok']:
            tok['feature'] = tok['feature'].split(',') # convert feature to a list
# Added lines ↑

print(json.dumps(jsonobj, indent=2, ensure_ascii=False))

JSON output, ver. 2

{
  "sentence": {
    "chunk": [
      {
        "id": "0",
        "link": "3",
        "rel": "D",
        "score": "-1.359140",
        "head": "0",
        "func": "1",
        "tok": [
          {
            "id": "0",
            "feature": [
              "名詞",
              "副詞可能",
              "*",
              "*",
              "*",
              "*",
              "今日",
              "キョウ",
              "キョー"
            ],
            "surface": "今日"
          },
          {
            "id": "1",
            "feature": [
              "助詞",
              "係助詞",
              "*",
              "*",
              "*",
              "*",
              "は",
              "ハ",
              "ワ"
            ],
            "surface": "は"
          }
        ]
      },
      {
        "id": "1",
        "link": "3",
        "rel": "D",
        "score": "-1.359140",
        "head": "2",
        "func": "3",
        "tok": [
          {
            "id": "2",
            "feature": [
              "名詞",
              "一般",
              "*",
              "*",
              "*",
              "*",
              "天気",
              "テンキ",
              "テンキ"
            ],
            "surface": "天気"
          },
          {
            "id": "3",
            "feature": [
              "助詞",
              "格助詞",
              "一般",
              "*",
              "*",
              "*",
              "が",
              "ガ",
              "ガ"
            ],
            "surface": "が"
          }
        ]
      },
      {
        "id": "2",
        "link": "3",
        "rel": "D",
        "score": "-1.359140",
        "head": "4",
        "func": "4",
        "tok": [
          {
            "id": "4",
            "feature": [
              "副詞",
              "助詞類接続",
              "*",
              "*",
              "*",
              "*",
              "とても",
              "トテモ",
              "トテモ"
            ],
            "surface": "とても"
          }
        ]
      },
      {
        "id": "3",
        "link": "-1",
        "rel": "D",
        "score": "0.000000",
        "head": "5",
        "func": "7",
        "tok": [
          {
            "id": "5",
            "feature": [
              "形容詞",
              "自立",
              "*",
              "*",
              "形容詞・アウオ段",
              "基本形",
              "良い",
              "ヨイ",
              "ヨイ"
            ],
            "surface": "良い"
          },
          {
            "id": "6",
            "feature": [
              "助動詞",
              "*",
              "*",
              "*",
              "特殊・デス",
              "基本形",
              "です",
              "デス",
              "デス"
            ],
            "surface": "です"
          },
          {
            "id": "7",
            "feature": [
              "助詞",
              "終助詞",
              "*",
              "*",
              "*",
              "*",
              "ね",
              "ネ",
              "ネ"
            ],
            "surface": "ね"
          },
          {
            "id": "8",
            "feature": [
              "記号",
              "句点",
              "*",
              "*",
              "*",
              "*",
              "。",
              "。",
              "。"
            ],
            "surface": "。"
          }
        ]
      }
    ]
  }
}
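For comparison, roughly the same structure can be built with only the standard library by walking the XML with xml.etree.ElementTree. This is an alternative sketch, not the xmltodict approach used in this post: cabocha_xml_to_dict is a hypothetical name, and SAMPLE_XML is one chunk copied from the XML output earlier so the script runs without CaboCha installed.

```python
import json
import xml.etree.ElementTree as ET

# One chunk copied from the XML output shown earlier in this post.
SAMPLE_XML = """<sentence>
 <chunk id="2" link="3" rel="D" score="-1.359140" head="4" func="4">
  <tok id="4" feature="副詞,助詞類接続,*,*,*,*,とても,トテモ,トテモ">とても</tok>
 </chunk>
</sentence>"""


def cabocha_xml_to_dict(xml_text: str) -> dict:
    """Convert CaboCha-style XML into a dict where chunk and tok are always lists."""
    sentence = ET.fromstring(xml_text)
    chunks = []
    for chunk_el in sentence.findall("chunk"):
        chunk = dict(chunk_el.attrib)  # id, link, rel, score, head, func
        chunk["tok"] = [
            {
                "id": tok_el.get("id"),
                "feature": tok_el.get("feature", "").split(","),
                "surface": tok_el.text or "",
            }
            for tok_el in chunk_el.findall("tok")
        ]
        chunks.append(chunk)
    return {"sentence": {"chunk": chunks}}


if __name__ == "__main__":
    result = cabocha_xml_to_dict(SAMPLE_XML)
    print(json.dumps(result, indent=2, ensure_ascii=False))
```

Because findall always returns a list, the single-element case needs no special handling here.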

2021/05/30

[Log] macOS Big Sur 11.2.2: ./bin/install-mecab-ipadic-neologd -n -a

Command: ./bin/install-mecab-ipadic-neologd -n -a
Date: 2021/05/30
Environment: macOS Big Sur 11.2.2

% ./bin/install-mecab-ipadic-neologd -n -a
[install-mecab-ipadic-NEologd] : Start..
[install-mecab-ipadic-NEologd] : Check the existance of libraries
[install-mecab-ipadic-NEologd] :     find => ok
[install-mecab-ipadic-NEologd] :     sort => ok
[install-mecab-ipadic-NEologd] :     head => ok
[install-mecab-ipadic-NEologd] :     cut => ok
[install-mecab-ipadic-NEologd] :     egrep => ok
[install-mecab-ipadic-NEologd] :     mecab => ok
[install-mecab-ipadic-NEologd] :     mecab-config => ok
[install-mecab-ipadic-NEologd] :     make => ok
[install-mecab-ipadic-NEologd] :     curl => ok
[install-mecab-ipadic-NEologd] :     sed => ok
[install-mecab-ipadic-NEologd] :     cat => ok
[install-mecab-ipadic-NEologd] :     diff => ok
[install-mecab-ipadic-NEologd] :     tar => ok
[install-mecab-ipadic-NEologd] :     unxz => ok
[install-mecab-ipadic-NEologd] :     xargs => ok
[install-mecab-ipadic-NEologd] :     grep => ok
[install-mecab-ipadic-NEologd] :     iconv => ok
[install-mecab-ipadic-NEologd] :     patch => ok
[install-mecab-ipadic-NEologd] :     which => ok
[install-mecab-ipadic-NEologd] :     file => ok
[install-mecab-ipadic-NEologd] :     openssl => ok
[install-mecab-ipadic-NEologd] :     awk => ok

[install-mecab-ipadic-NEologd] : mecab-ipadic-NEologd is already up-to-date

[install-mecab-ipadic-NEologd] : mecab-ipadic-NEologd will be install to /usr/local/lib/mecab/dic/mecab-ipadic-neologd

[install-mecab-ipadic-NEologd] : Make mecab-ipadic-NEologd
[make-mecab-ipadic-NEologd] : Start..
[make-mecab-ipadic-NEologd] : Check local seed directory
[make-mecab-ipadic-NEologd] : Check local seed file
[make-mecab-ipadic-NEologd] : Check local build directory
[make-mecab-ipadic-NEologd] : create /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../build
[make-mecab-ipadic-NEologd] : Download original mecab-ipadic file
[make-mecab-ipadic-NEologd] : Try to access to https://ja.osdn.net
[make-mecab-ipadic-NEologd] : Try to download from https://ja.osdn.net/frs/g_redir.php?m=kent&f=mecab%2Fmecab-ipadic%2F2.7.0-20070801%2Fmecab-ipadic-2.7.0-20070801.tar.gz
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100 11.6M  100 11.6M    0     0  4006k      0  0:00:02  0:00:02 --:--:-- 5855k
Hash value of /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../build/mecab-ipadic-2.7.0-20070801.tar.gz matched
[make-mecab-ipadic-NEologd] : Decompress original mecab-ipadic file
x mecab-ipadic-2.7.0-20070801/
x mecab-ipadic-2.7.0-20070801/README
x mecab-ipadic-2.7.0-20070801/AUTHORS
x mecab-ipadic-2.7.0-20070801/COPYING
x mecab-ipadic-2.7.0-20070801/ChangeLog
x mecab-ipadic-2.7.0-20070801/INSTALL
x mecab-ipadic-2.7.0-20070801/Makefile.am
x mecab-ipadic-2.7.0-20070801/Makefile.in
x mecab-ipadic-2.7.0-20070801/NEWS
x mecab-ipadic-2.7.0-20070801/aclocal.m4
x mecab-ipadic-2.7.0-20070801/config.guess
x mecab-ipadic-2.7.0-20070801/config.sub
x mecab-ipadic-2.7.0-20070801/configure
x mecab-ipadic-2.7.0-20070801/configure.in
x mecab-ipadic-2.7.0-20070801/install-sh
x mecab-ipadic-2.7.0-20070801/missing
x mecab-ipadic-2.7.0-20070801/mkinstalldirs
x mecab-ipadic-2.7.0-20070801/Adj.csv
x mecab-ipadic-2.7.0-20070801/Adnominal.csv
x mecab-ipadic-2.7.0-20070801/Adverb.csv
x mecab-ipadic-2.7.0-20070801/Auxil.csv
x mecab-ipadic-2.7.0-20070801/Conjunction.csv
x mecab-ipadic-2.7.0-20070801/Filler.csv
x mecab-ipadic-2.7.0-20070801/Interjection.csv
x mecab-ipadic-2.7.0-20070801/Noun.adjv.csv
x mecab-ipadic-2.7.0-20070801/Noun.adverbal.csv
x mecab-ipadic-2.7.0-20070801/Noun.csv
x mecab-ipadic-2.7.0-20070801/Noun.demonst.csv
x mecab-ipadic-2.7.0-20070801/Noun.nai.csv
x mecab-ipadic-2.7.0-20070801/Noun.name.csv
x mecab-ipadic-2.7.0-20070801/Noun.number.csv
x mecab-ipadic-2.7.0-20070801/Noun.org.csv
x mecab-ipadic-2.7.0-20070801/Noun.others.csv
x mecab-ipadic-2.7.0-20070801/Noun.place.csv
x mecab-ipadic-2.7.0-20070801/Noun.proper.csv
x mecab-ipadic-2.7.0-20070801/Noun.verbal.csv
x mecab-ipadic-2.7.0-20070801/Others.csv
x mecab-ipadic-2.7.0-20070801/Postp-col.csv
x mecab-ipadic-2.7.0-20070801/Postp.csv
x mecab-ipadic-2.7.0-20070801/Prefix.csv
x mecab-ipadic-2.7.0-20070801/Suffix.csv
x mecab-ipadic-2.7.0-20070801/Symbol.csv
x mecab-ipadic-2.7.0-20070801/Verb.csv
x mecab-ipadic-2.7.0-20070801/char.def
x mecab-ipadic-2.7.0-20070801/feature.def
x mecab-ipadic-2.7.0-20070801/left-id.def
x mecab-ipadic-2.7.0-20070801/matrix.def
x mecab-ipadic-2.7.0-20070801/pos-id.def
x mecab-ipadic-2.7.0-20070801/rewrite.def
x mecab-ipadic-2.7.0-20070801/right-id.def
x mecab-ipadic-2.7.0-20070801/unk.def
x mecab-ipadic-2.7.0-20070801/dicrc
x mecab-ipadic-2.7.0-20070801/RESULT
[make-mecab-ipadic-NEologd] : Configure custom system dictionary on /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../build/mecab-ipadic-2.7.0-20070801-neologd-20200910
checking for a BSD-compatible install... /usr/bin/install -c
checking whether build environment is sane... yes
checking whether make sets $(MAKE)... yes
checking for working aclocal-1.4... missing
checking for working autoconf... found
checking for working automake-1.4... missing
checking for working autoheader... found
checking for working makeinfo... found
checking for a BSD-compatible install... /usr/bin/install -c
checking for mecab-config... /usr/local/bin/mecab-config
configure: creating ./config.status
config.status: creating Makefile
[make-mecab-ipadic-NEologd] : Encode the character encoding of system dictionary resources from EUC_JP to UTF-8
./../../libexec/iconv_euc_to_utf8.sh ./Noun.place.csv
./../../libexec/iconv_euc_to_utf8.sh ./Auxil.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.verbal.csv
./../../libexec/iconv_euc_to_utf8.sh ./Symbol.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.org.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.csv
./../../libexec/iconv_euc_to_utf8.sh ./Postp.csv
./../../libexec/iconv_euc_to_utf8.sh ./Adj.csv
./../../libexec/iconv_euc_to_utf8.sh ./Filler.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.proper.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.number.csv
./../../libexec/iconv_euc_to_utf8.sh ./Suffix.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.others.csv
./../../libexec/iconv_euc_to_utf8.sh ./Interjection.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.adjv.csv
./../../libexec/iconv_euc_to_utf8.sh ./Verb.csv
./../../libexec/iconv_euc_to_utf8.sh ./Others.csv
./../../libexec/iconv_euc_to_utf8.sh ./Adnominal.csv
./../../libexec/iconv_euc_to_utf8.sh ./Prefix.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.demonst.csv
./../../libexec/iconv_euc_to_utf8.sh ./Adverb.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.name.csv
./../../libexec/iconv_euc_to_utf8.sh ./Postp-col.csv
./../../libexec/iconv_euc_to_utf8.sh ./Conjunction.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.nai.csv
./../../libexec/iconv_euc_to_utf8.sh ./Noun.adverbal.csv
rm ./Noun.place.csv
rm ./Auxil.csv
rm ./Noun.verbal.csv
rm ./Symbol.csv
rm ./Noun.org.csv
rm ./Noun.csv
rm ./Postp.csv
rm ./Adj.csv
rm ./Filler.csv
rm ./Noun.proper.csv
rm ./Noun.number.csv
rm ./Suffix.csv
rm ./Noun.others.csv
rm ./Interjection.csv
rm ./Noun.adjv.csv
rm ./Verb.csv
rm ./Others.csv
rm ./Adnominal.csv
rm ./Prefix.csv
rm ./Noun.demonst.csv
rm ./Adverb.csv
rm ./Noun.name.csv
rm ./Postp-col.csv
rm ./Conjunction.csv
rm ./Noun.nai.csv
rm ./Noun.adverbal.csv
./../../libexec/iconv_euc_to_utf8.sh ./rewrite.def
./../../libexec/iconv_euc_to_utf8.sh ./matrix.def
./../../libexec/iconv_euc_to_utf8.sh ./left-id.def
./../../libexec/iconv_euc_to_utf8.sh ./pos-id.def
./../../libexec/iconv_euc_to_utf8.sh ./unk.def
./../../libexec/iconv_euc_to_utf8.sh ./feature.def
./../../libexec/iconv_euc_to_utf8.sh ./right-id.def
./../../libexec/iconv_euc_to_utf8.sh ./char.def
rm ./rewrite.def
rm ./matrix.def
rm ./left-id.def
rm ./pos-id.def
rm ./unk.def
rm ./feature.def
rm ./right-id.def
rm ./char.def
mv ./Noun.others.csv.utf8 ./Noun.others.csv
mv ./Noun.number.csv.utf8 ./Noun.number.csv
mv ./Filler.csv.utf8 ./Filler.csv
mv ./Others.csv.utf8 ./Others.csv
mv ./unk.def.utf8 ./unk.def
mv ./Postp-col.csv.utf8 ./Postp-col.csv
mv ./Adnominal.csv.utf8 ./Adnominal.csv
mv ./Noun.verbal.csv.utf8 ./Noun.verbal.csv
mv ./matrix.def.utf8 ./matrix.def
mv ./Noun.csv.utf8 ./Noun.csv
mv ./Noun.demonst.csv.utf8 ./Noun.demonst.csv
mv ./char.def.utf8 ./char.def
mv ./Symbol.csv.utf8 ./Symbol.csv
mv ./Auxil.csv.utf8 ./Auxil.csv
mv ./Noun.name.csv.utf8 ./Noun.name.csv
mv ./feature.def.utf8 ./feature.def
mv ./Suffix.csv.utf8 ./Suffix.csv
mv ./Adverb.csv.utf8 ./Adverb.csv
mv ./Conjunction.csv.utf8 ./Conjunction.csv
mv ./pos-id.def.utf8 ./pos-id.def
mv ./Postp.csv.utf8 ./Postp.csv
mv ./right-id.def.utf8 ./right-id.def
mv ./Noun.nai.csv.utf8 ./Noun.nai.csv
mv ./Interjection.csv.utf8 ./Interjection.csv
mv ./Prefix.csv.utf8 ./Prefix.csv
mv ./Noun.place.csv.utf8 ./Noun.place.csv
mv ./Noun.adjv.csv.utf8 ./Noun.adjv.csv
mv ./rewrite.def.utf8 ./rewrite.def
mv ./Verb.csv.utf8 ./Verb.csv
mv ./left-id.def.utf8 ./left-id.def
mv ./Noun.proper.csv.utf8 ./Noun.proper.csv
mv ./Adj.csv.utf8 ./Adj.csv
mv ./Noun.adverbal.csv.utf8 ./Noun.adverbal.csv
mv ./Noun.org.csv.utf8 ./Noun.org.csv
[make-mecab-ipadic-NEologd] : Fix yomigana field of IPA dictionary
patching file Noun.csv
patching file Noun.place.csv
patching file Verb.csv
patching file Noun.verbal.csv
patching file Noun.name.csv
patching file Noun.adverbal.csv
patching file Noun.csv
patching file Noun.name.csv
patching file Noun.org.csv
patching file Noun.others.csv
patching file Noun.place.csv
patching file Noun.proper.csv
patching file Noun.verbal.csv
patching file Prefix.csv
patching file Suffix.csv
patching file Noun.proper.csv
patching file Noun.csv
patching file Noun.name.csv
patching file Noun.org.csv
patching file Noun.place.csv
patching file Noun.proper.csv
patching file Noun.verbal.csv
patching file Noun.name.csv
patching file Noun.org.csv
patching file Noun.place.csv
patching file Noun.proper.csv
patching file Suffix.csv
patching file Noun.demonst.csv
patching file Noun.csv
patching file Noun.name.csv
[make-mecab-ipadic-NEologd] : Copy user dictionary resource
[make-mecab-ipadic-NEologd] : Install adverb entries using /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../seed/neologd-adverb-dict-seed.20150623.csv.xz
[make-mecab-ipadic-NEologd] : Install interjection entries using /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../seed/neologd-interjection-dict-seed.20170216.csv.xz
[make-mecab-ipadic-NEologd] : Install noun orthographic variant entries using /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../seed/neologd-common-noun-ortho-variant-dict-seed.20170228.csv.xz
[make-mecab-ipadic-NEologd] : Install noun orthographic variant entries using /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../seed/neologd-proper-noun-ortho-variant-dict-seed.20161110.csv.xz
[make-mecab-ipadic-NEologd] : Install entries of orthographic variant of a noun used as verb form using /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../seed/neologd-noun-sahen-conn-ortho-variant-dict-seed.20160323.csv.xz
[make-mecab-ipadic-NEologd] : Install frequent adjective orthographic variant entries using /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../seed/neologd-adjective-std-dict-seed.20151126.csv.xz
[make-mecab-ipadic-NEologd] : Install infrequent adjective orthographic variant entries using /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../seed/neologd-adjective-exp-dict-seed.20151126.csv.xz
[make-mecab-ipadic-NEologd] : Install adjective verb orthographic variant entries using /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../seed/neologd-adjective-verb-dict-seed.20160324.csv.xz
[make-mecab-ipadic-NEologd] : Install infrequent datetime representation entries using /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../seed/neologd-date-time-infreq-dict-seed.20190415.csv.xz
[make-mecab-ipadic-NEologd] : Install infrequent quantity representation entries using /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../seed/neologd-quantity-infreq-dict-seed.20190415.csv.xz
[make-mecab-ipadic-NEologd] : Install entries of ill formed words using /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../seed/neologd-ill-formed-words-dict-seed.20170127.csv.xz
[make-mecab-ipadic-NEologd] : Re-Index system dictionary
reading ./unk.def ... 40
emitting double-array: 100% |###########################################| 
./model.def is not found. skipped.
reading ./neologd-adjective-verb-dict-seed.20160324.csv ... 20268
reading ./Noun.place.csv ... 73194
reading ./Auxil.csv ... 199
reading ./Noun.verbal.csv ... 12150
reading ./Symbol.csv ... 208
reading ./Noun.org.csv ... 17149
reading ./Noun.csv ... 60734
reading ./Postp.csv ... 146
reading ./neologd-ill-formed-words-dict-seed.20170127.csv ... 60616
reading ./Adj.csv ... 27210
reading ./Filler.csv ... 19
reading ./Noun.proper.csv ... 27493
reading ./Noun.number.csv ... 42
reading ./Suffix.csv ... 1448
reading ./mecab-user-dict-seed.20200910.csv ... 3224584
reading ./Noun.others.csv ... 153
reading ./Interjection.csv ... 252
reading ./Noun.adjv.csv ... 3328
reading ./Verb.csv ... 130750
reading ./neologd-date-time-infreq-dict-seed.20190415.csv ... 16866
reading ./neologd-proper-noun-ortho-variant-dict-seed.20161110.csv ... 138379
reading ./neologd-adjective-exp-dict-seed.20151126.csv ... 1051146
reading ./Others.csv ... 2
reading ./Adnominal.csv ... 135
reading ./neologd-common-noun-ortho-variant-dict-seed.20170228.csv ... 152869
reading ./neologd-quantity-infreq-dict-seed.20190415.csv ... 229216
reading ./neologd-noun-sahen-conn-ortho-variant-dict-seed.20160323.csv ... 26058
reading ./neologd-adjective-std-dict-seed.20151126.csv ... 507812
reading ./Prefix.csv ... 224
reading ./Noun.demonst.csv ... 120
reading ./Adverb.csv ... 3032
reading ./neologd-adverb-dict-seed.20150623.csv ... 139792
reading ./neologd-interjection-dict-seed.20170216.csv ... 4701
reading ./Noun.name.csv ... 34215
reading ./Postp-col.csv ... 91
reading ./Conjunction.csv ... 171
reading ./Noun.nai.csv ... 42
reading ./Noun.adverbal.csv ... 808
emitting double-array: 100% |###########################################| 
reading ./matrix.def ... 1316x1316
emitting matrix      : 100% |###########################################| 

done!
[make-mecab-ipadic-NEologd] : Make custom system dictionary on /usr/local/lib/mecab/dic/mecab-ipadic-neologd/libexec/../build/mecab-ipadic-2.7.0-20070801-neologd-20200910
make: Nothing to be done for `all'.
[make-mecab-ipadic-NEologd] : Finish..
[install-mecab-ipadic-NEologd] : Get results of tokenize test
[test-mecab-ipadic-NEologd] : Start..
[test-mecab-ipadic-NEologd] : Replace timestamp from 'git clone' date to 'git commit' date
[test-mecab-ipadic-NEologd] : Get buzz phrases
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 31978    0 31978    0     0   127k      0 --:--:-- --:--:-- --:--:--  127k
[test-mecab-ipadic-NEologd] : Get difference between default system dictionary and mecab-ipadic-NEologd
[test-mecab-ipadic-NEologd] : Something wrong. You shouldn't install mecab-ipadic-NEologd yet.
[test-mecab-ipadic-NEologd] : Finish..

[install-mecab-ipadic-NEologd] : Please check the list of differences in the upper part.

[install-mecab-ipadic-NEologd] : Do you want to install mecab-ipadic-NEologd? Type yes or no.
yes
[install-mecab-ipadic-NEologd] : OK. Let's install mecab-ipadic-NEologd.
[install-mecab-ipadic-NEologd] : Start..
[install-mecab-ipadic-NEologd] : /usr/local/lib/mecab/dic is current user's directory
[install-mecab-ipadic-NEologd] : Make install to /usr/local/lib/mecab/dic/mecab-ipadic-neologd
make[1]: Nothing to be done for `install-exec-am'.
/bin/sh ./mkinstalldirs /usr/local/lib/mecab/dic/mecab-ipadic-neologd
 /usr/bin/install -c -m 644 ./matrix.bin /usr/local/lib/mecab/dic/mecab-ipadic-neologd/matrix.bin
 /usr/bin/install -c -m 644 ./char.bin /usr/local/lib/mecab/dic/mecab-ipadic-neologd/char.bin
 /usr/bin/install -c -m 644 ./sys.dic /usr/local/lib/mecab/dic/mecab-ipadic-neologd/sys.dic
 /usr/bin/install -c -m 644 ./unk.dic /usr/local/lib/mecab/dic/mecab-ipadic-neologd/unk.dic
 /usr/bin/install -c -m 644 ./left-id.def /usr/local/lib/mecab/dic/mecab-ipadic-neologd/left-id.def
 /usr/bin/install -c -m 644 ./right-id.def /usr/local/lib/mecab/dic/mecab-ipadic-neologd/right-id.def
 /usr/bin/install -c -m 644 ./rewrite.def /usr/local/lib/mecab/dic/mecab-ipadic-neologd/rewrite.def
 /usr/bin/install -c -m 644 ./pos-id.def /usr/local/lib/mecab/dic/mecab-ipadic-neologd/pos-id.def
 /usr/bin/install -c -m 644 ./dicrc /usr/local/lib/mecab/dic/mecab-ipadic-neologd/dicrc

[install-mecab-ipadic-NEologd] : Install completed.
[install-mecab-ipadic-NEologd] : When you use MeCab, you can set '/usr/local/lib/mecab/dic/mecab-ipadic-neologd' as a value of '-d' option of MeCab.
[install-mecab-ipadic-NEologd] : Usage of mecab-ipadic-NEologd is here.
Usage:
    $ mecab -d /usr/local/lib/mecab/dic/mecab-ipadic-neologd ...

[install-mecab-ipadic-NEologd] : Finish..
[install-mecab-ipadic-NEologd] : Finish..
%

2021/05/29 / 2021/12/22

[Mac] Installing CaboCha for Python and running dependency parsing

This post shows how to run dependency parsing with CaboCha from Python on a Mac.

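Installing MeCab, CRF++, and CaboCha
- Installing MeCab
- Installing CRF++ and CaboCha
Trying CaboCha (without Python)
CaboCha Python bindings
- When you create an additional virtual environment
Using CaboCha from Python
- Printing dependency relations
- Printing morphemes
Handling new words with the NEologd dictionary
- Installing the NEologd dictionary
- Using the NEologd dictionary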

1. Installing MeCab, CRF++, and CaboCha

First install MeCab, CRF++, and CaboCha. Run the commands below with your Python virtual environment activated.

Installing MeCab

% brew install mecab
% brew install mecab-ipadic
% pip install mecab-python3

Installing CRF++ and CaboCha

% brew install crf++
% brew install cabocha

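2. Trying CaboCha (without Python)

Once the packages above are installed, CaboCha can be used directly from the terminal.

Run the "cabocha" command and type "今日は良い天気ですね。" (roughly "Nice weather today, isn't it?"); the output looks like this.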

% cabocha
今日は良い天気ですね。
      今日は---D
          良い-D
    天気ですね。
EOS

So far, though, we have only run CaboCha directly from the shell, without starting Python.
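As an aside, the same command can be driven from Python even before the bindings are installed, by piping text to it. The following is only a sketch: the helper name run_cabocha is my own, and it assumes the "cabocha" executable reads sentences from standard input as in the session above.

```python
import shutil
import subprocess


def run_cabocha(text, exe="cabocha"):
    """Pipe `text` to the cabocha command and return what it prints.

    Returns None when the executable cannot be found on PATH.
    """
    path = shutil.which(exe)
    if path is None:
        return None
    result = subprocess.run(
        [path],
        input=text + "\n",
        capture_output=True,
        encoding="utf-8",  # implies text mode for stdin/stdout
        check=True,        # raise if cabocha exits with an error
    )
    return result.stdout


if __name__ == "__main__":
    out = run_cabocha("今日は良い天気ですね。")
    print(out if out is not None else "cabocha was not found on PATH")
```

This starts a new process for every call, so it is only practical for quick checks; the bindings described next avoid that overhead.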

3. CaboCha Python bindings

There is a link to cabocha-0.69.tar.bz2, so download the archive from there.

The archive should now be in the Downloads folder.

% cd /Users/<username>/Downloads
% ls
cabocha-0.69.tar.bz2

Extract the archive, then run configure, make, and make install.

% tar xfv cabocha-0.69.tar.bz2
% cd cabocha-0.69
% ./configure --prefix=/usr/local/cabocha/0_69 --with-charset=UTF8 --with-posset=IPA
% make
% make install

With the Python virtual environment activated, move into the "python" folder directly under "cabocha-0.69" and run "sudo python setup.py install".

% cd python
% sudo python setup.py install

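After this, "import CaboCha" works.

I still don't fully understand why the import keeps working even after the cabocha-0.69 folder is deleted from Downloads. CaboCha.py was created in the virtual environment's site-packages, so perhaps that is all it needs. The install log for this step does show both CaboCha.py and the compiled _CaboCha extension being copied into site-packages. I'll go through the log more carefully later.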
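One way to check where the import is actually served from is to ask Python's import machinery. This is a small sketch using only the standard library; the helper name module_location is made up for illustration, and "_CaboCha" is the extension name that appears in the install log.

```python
import importlib.util


def module_location(name):
    """Return the file a top-level module would load from, or None if absent."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # CaboCha is the Python wrapper, _CaboCha the compiled extension behind it.
    for name in ("CaboCha", "_CaboCha"):
        print(name, "->", module_location(name))
```

If both paths point into the virtual environment's site-packages, the extracted source folder is no longer involved in the import.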

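When you create an additional virtual environment

If you have already gone through the steps above once and have CaboCha working, you can skip several of them when setting up another virtual environment.

Keep the extracted "cabocha-0.69" folder, activate the newly created virtual environment, and run the following. The commands move into "cabocha-0.69/python" and install from there.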

% pip install mecab-python3
% cd cabocha-0.69/python
% sudo python setup.py install
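Since the bindings land in whichever environment the "python" command belongs to, it can help to confirm that before running setup.py. A minimal sketch, assuming the environment was created with venv or a compatible tool; the function name in_virtualenv is my own.

```python
import sys


def in_virtualenv():
    """True when the running interpreter belongs to a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix points at the interpreter it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("environment prefix:", sys.prefix)
```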

4. Using CaboCha from Python

After starting Python, "import CaboCha" succeeded and the following ran as well.

>>> import CaboCha
>>> c = CaboCha.Parser()
>>> sentence = '今日は良い天気ですね。'
>>> print(c.parseToString(sentence))
      今日は---D
          良い-D
    天気ですね。
EOS

Printing dependency relations

>>> tree = c.parse(sentence)
>>> print(tree.toString(CaboCha.FORMAT_TREE))
      今日は---D
          良い-D
    天気ですね。
EOS

>>> print(tree.toString(CaboCha.FORMAT_LATTICE))
* 0 2D 0/1 -1.140323
今日	名詞,副詞可能,*,*,*,*,今日,キョウ,キョー
は	助詞,係助詞,*,*,*,*,は,ハ,ワ
* 1 2D 0/0 -1.140323
良い	形容詞,自立,*,*,形容詞・アウオ段,基本形,良い,ヨイ,ヨイ
* 2 -1D 0/2 0.000000
天気	名詞,一般,*,*,*,*,天気,テンキ,テンキ
です	助動詞,*,*,*,特殊・デス,基本形,です,デス,デス
ね	助詞,終助詞,*,*,*,*,ね,ネ,ネ
。	記号,句点,*,*,*,*,。,。,。
EOS
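The lattice text above can be turned into explicit "chunk -> head chunk" pairs with plain string handling. The sketch below is my own reading of the format as it appears in this output: a line starting with "* " opens a chunk and carries its index and link target (for example "2D"), the following tab-separated lines are its tokens, and a link of -1 marks the chunk that depends on nothing. The names Chunk and parse_lattice are hypothetical, not part of CaboCha.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    index: int
    link: int  # index of the chunk this one depends on; -1 means no head
    tokens: list = field(default_factory=list)  # (surface, feature) pairs

    @property
    def text(self):
        return "".join(surface for surface, _ in self.tokens)


def parse_lattice(lattice):
    """Group the token lines of lattice-format text under their chunk headers."""
    chunks = []
    for line in lattice.splitlines():
        if not line or line == "EOS":
            continue
        if line.startswith("* "):
            # Header such as "* 0 2D 0/1 -1.140323": only the chunk index
            # and the link target ("2D" -> 2) are used here.
            parts = line.split(" ")
            chunks.append(Chunk(int(parts[1]), int(parts[2].rstrip("D"))))
        else:
            surface, feature = line.split("\t", 1)
            chunks[-1].tokens.append((surface, feature))
    return chunks


# The lattice output shown above, reproduced as a string.
SAMPLE = "\n".join([
    "* 0 2D 0/1 -1.140323",
    "今日\t名詞,副詞可能,*,*,*,*,今日,キョウ,キョー",
    "は\t助詞,係助詞,*,*,*,*,は,ハ,ワ",
    "* 1 2D 0/0 -1.140323",
    "良い\t形容詞,自立,*,*,形容詞・アウオ段,基本形,良い,ヨイ,ヨイ",
    "* 2 -1D 0/2 0.000000",
    "天気\t名詞,一般,*,*,*,*,天気,テンキ,テンキ",
    "です\t助動詞,*,*,*,特殊・デス,基本形,です,デス,デス",
    "ね\t助詞,終助詞,*,*,*,*,ね,ネ,ネ",
    "。\t記号,句点,*,*,*,*,。,。,。",
    "EOS",
])

if __name__ == "__main__":
    chunks = parse_lattice(SAMPLE)
    for chunk in chunks:
        if chunk.link >= 0:
            print(chunk.text, "->", chunks[chunk.link].text)
```

With a live parser, the same function could be fed tree.toString(CaboCha.FORMAT_LATTICE) instead of the SAMPLE string.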

Printing morphemes

Morpheme surface strings

>>> for i in range(tree.size()):
...     print(tree.token(i).surface)
... 
今日
は
良い
天気
です
ね
。
>>>

Morpheme features

>>> for i in range(tree.size()):
...     print(tree.token(i).feature)
... 
名詞,副詞可能,*,*,*,*,今日,キョウ,キョー
助詞,係助詞,*,*,*,*,は,ハ,ワ
形容詞,自立,*,*,形容詞・アウオ段,基本形,良い,ヨイ,ヨイ
名詞,一般,*,*,*,*,天気,テンキ,テンキ
助動詞,*,*,*,特殊・デス,基本形,です,デス,デス
助詞,終助詞,*,*,*,*,ね,ネ,ネ
記号,句点,*,*,*,*,。,。,。
>>>
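Each feature string is a comma-separated record, so it is easier to work with once split into named fields. In the sketch below the field names are my own labels for the nine columns seen in the output above (part of speech, three sub-categories, conjugation type and form, base form, reading, pronunciation); they are an assumption for illustration, not names defined by CaboCha or MeCab.

```python
import csv
from collections import namedtuple

# Hypothetical labels for the nine comma-separated columns shown above.
Feature = namedtuple("Feature", [
    "pos", "pos_detail1", "pos_detail2", "pos_detail3",
    "conj_type", "conj_form", "base_form", "reading", "pronunciation",
])


def parse_feature(feature):
    """Split a feature string into named fields, padding short rows with '*'."""
    # csv.reader copes with quoted values that themselves contain commas.
    fields = next(csv.reader([feature]), [])
    width = len(Feature._fields)
    fields = (fields + ["*"] * width)[:width]
    return Feature(*fields)


if __name__ == "__main__":
    f = parse_feature("形容詞,自立,*,*,形容詞・アウオ段,基本形,良い,ヨイ,ヨイ")
    print(f.pos, f.base_form, f.reading)
```

Inside the loop above this would be called as parse_feature(tree.token(i).feature). Padding with "*" keeps the function usable for rows that carry fewer than nine columns.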

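5. Handling new words with the NEologd dictionary

By default the "IPA dictionary" (ipadic) is used, but the "NEologd dictionary" seems to be widely used to cover newer words.

Installing the NEologd dictionary

Move to the directory that holds the standard dictionary "ipadic". It is probably "/usr/local/lib/mecab/dic" or somewhere similar.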

% cd /usr/local/lib/mecab/dic
% ls
ipadic
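If the dictionary directory is somewhere else on your machine, the "mecab-config --dicdir" command that ships with MeCab should print it. Here is a sketch that wraps that call from Python; the helper name mecab_dicdir is my own, and it assumes mecab-config is on PATH.

```python
import shutil
import subprocess


def mecab_dicdir(config_exe="mecab-config"):
    """Return the dictionary directory reported by mecab-config, or None."""
    path = shutil.which(config_exe)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--dicdir"],
        capture_output=True,
        text=True,
        check=True,  # raise if mecab-config exits with an error
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("dictionary directory:", mecab_dicdir())
```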

Use git clone to fetch "mecab-ipadic-neologd".

% git clone --depth 1 https://github.com/neologd/mecab-ipadic-neologd.git
% ls
ipadic			mecab-ipadic-neologd

Move into the "mecab-ipadic-neologd" folder and run "./bin/install-mecab-ipadic-neologd -n -a".

% cd mecab-ipadic-neologd
% ./bin/install-mecab-ipadic-neologd -n -a

Partway through you will be asked "Do you want to install mecab-ipadic-NEologd? Type yes or no."; type "yes".

That completes the installation.

Using the NEologd dictionary

CaboCha and MeCab use the IPA dictionary by default, so the NEologd dictionary has to be specified explicitly.

The option "-d /usr/local/lib/mecab/dic/mecab-ipadic-neologd" is passed at run time; in Python, writing "CaboCha.Parser('-d /usr/local/lib/mecab/dic/mecab-ipadic-neologd')" as in the code below is enough.

import CaboCha

sentence = '霜降り明星（しもふりみょうじょう）は、2018年『M-1グランプリ』14代目王者。'

# IPA dictionary (the default)
c = CaboCha.Parser()
print('IPA dictionary:')
print(c.parseToString(sentence))

# NEologd dictionary, selected with the -d option
c = CaboCha.Parser('-d /usr/local/lib/mecab/dic/mecab-ipadic-neologd')
print('NEologd dictionary:')
print(c.parseToString(sentence))

Running the code above produces the following output.

IPA dictionary:
      霜降り明星---D            
            （しも-D            
              ふりみ-D          
      ょうじょう）は、---------D
                  2018年---D   |
                      『M--D   |
               1グランプリ』-D |
                        14代目-D
                          王者。
EOS

NEologd dictionary:
        霜降り明星-----D      
              （しも-D |      
                  ふり-D      
      みょうじょう）は、-----D
                    2018年-D |
           『M-1グランプリ』-D
                  14代目王者。
EOS

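The chunking differs in a few places: the reading "しもふりみょうじょう" is split at different points, and with NEologd 『M-1グランプリ』 stays a single chunk instead of being broken after "M-".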
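Because the dictionary path is written directly into the script, it fails on a machine where NEologd is not installed in that location. One option is a small guard that only adds "-d" when the compiled dictionary is actually present. This is a sketch under my own assumptions: the helper name parser_args is hypothetical, and the check relies on the sys.dic file that the NEologd installer copies into the dictionary folder.

```python
import os

NEOLOGD_DIR = "/usr/local/lib/mecab/dic/mecab-ipadic-neologd"


def parser_args(dic_dir=NEOLOGD_DIR):
    """Return '-d <dic_dir>' if a compiled dictionary is there, else ''."""
    if os.path.isfile(os.path.join(dic_dir, "sys.dic")):
        return "-d " + dic_dir
    return ""


if __name__ == "__main__":
    args = parser_args()
    print("parser arguments:", repr(args))
    # With the bindings installed, the result could be used like this:
    #   import CaboCha
    #   c = CaboCha.Parser(args) if args else CaboCha.Parser()
```

An empty string means the script falls back to the default IPA dictionary rather than stopping with an error.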

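As a slightly more applied example, I fetched YouTube comments and ran morphological analysis on them; the post is linked below.

▶︎ [Mac] Running morphological analysis on YouTube comments with MeCab in Python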

2021/05/29

[Log] macOS Big Sur 11.2.2: sudo python setup.py install (after running configure, make, and make install for cabocha-0.69)

Command: sudo python setup.py install
- after running configure, make, and make install for cabocha-0.69
Date: 2021/05/29
Environment: macOS Big Sur 11.2.2

% sudo python3 setup.py install
running install
running build
running build_py
creating build
creating build/lib.macosx-10.9-x86_64-3.8
copying CaboCha.py -> build/lib.macosx-10.9-x86_64-3.8
running build_ext
building '_CaboCha' extension
creating build/temp.macosx-10.9-x86_64-3.8
gcc -Wno-unused-result -Wsign-compare -Wunreachable-code -fno-common -dynamic -DNDEBUG -g -fwrapv -O3 -Wall -arch x86_64 -g -I/usr/local/Cellar/cabocha/0.69/include -I/Users/<username>/<venv-dir>/include -I/Library/Frameworks/Python.framework/Versions/3.8/include/python3.8 -c CaboCha_wrap.cxx -o build/temp.macosx-10.9-x86_64-3.8/CaboCha_wrap.o
In file included from CaboCha_wrap.cxx:154:
In file included from /Library/Frameworks/Python.framework/Versions/3.8/include/python3.8/Python.h:85:
In file included from /Library/Frameworks/Python.framework/Versions/3.8/include/python3.8/pytime.h:6:
In file included from /Library/Frameworks/Python.framework/Versions/3.8/include/python3.8/object.h:746:
/Library/Frameworks/Python.framework/Versions/3.8/include/python3.8/cpython/object.h:177:16: warning: 'tp_print' is deprecated [-Wdeprecated-declarations]
typedef struct _typeobject {
               ^
CaboCha_wrap.cxx:1947:23: note: in implicit copy assignment operator for '_typeobject' first required here
    swigpyobject_type = tmp;
                      ^
/Library/Frameworks/Python.framework/Versions/3.8/include/python3.8/cpython/object.h:260:5: note: 'tp_print' has been explicitly marked deprecated here
    Py_DEPRECATED(3.8) int (*tp_print)(PyObject *, FILE *, int);
    ^
/Library/Frameworks/Python.framework/Versions/3.8/include/python3.8/pyport.h:515:54: note: expanded from macro 'Py_DEPRECATED'
#define Py_DEPRECATED(VERSION_UNUSED) __attribute__((__deprecated__))
                                                     ^
In file included from CaboCha_wrap.cxx:154:
In file included from /Library/Frameworks/Python.framework/Versions/3.8/include/python3.8/Python.h:85:
In file included from /Library/Frameworks/Python.framework/Versions/3.8/include/python3.8/pytime.h:6:
In file included from /Library/Frameworks/Python.framework/Versions/3.8/include/python3.8/object.h:746:
/Library/Frameworks/Python.framework/Versions/3.8/include/python3.8/cpython/object.h:177:16: warning: 'tp_print' is deprecated [-Wdeprecated-declarations]
typedef struct _typeobject {
               ^
/Library/Frameworks/Python.framework/Versions/3.8/include/python3.8/cpython/object.h:260:5: note: 'tp_print' has been explicitly marked deprecated here
    Py_DEPRECATED(3.8) int (*tp_print)(PyObject *, FILE *, int);
    ^
/Library/Frameworks/Python.framework/Versions/3.8/include/python3.8/pyport.h:515:54: note: expanded from macro 'Py_DEPRECATED'
#define Py_DEPRECATED(VERSION_UNUSED) __attribute__((__deprecated__))
                                                     ^
CaboCha_wrap.cxx:3669:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:3666:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:3871:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:3868:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:3919:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:3916:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:3953:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:3950:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:3985:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:3982:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4026:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4023:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4067:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4064:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4118:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4115:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4152:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4149:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4183:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4180:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4214:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4211:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4246:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4243:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4278:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4275:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4310:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4307:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4351:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4348:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4383:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4380:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4423:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4420:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4455:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4452:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4495:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4492:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4527:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4524:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4567:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4564:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4599:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4596:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4622:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4619:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4653:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4650:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4702:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4699:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4746:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4743:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4789:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4786:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4868:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4865:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4891:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4888:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4922:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4919:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4955:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4952:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:4980:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:4977:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:5041:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:5038:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:5130:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:5127:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:5225:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:5222:12: note: for type 'char *'
    catch (char *e) {
           ^
CaboCha_wrap.cxx:5320:18: warning: exception of type 'const char *' will be caught by earlier handler [-Wexceptions]
    catch (const char *e) {
                 ^
CaboCha_wrap.cxx:5317:12: note: for type 'char *'
    catch (char *e) {
           ^
38 warnings generated.
warning: no library file corresponding to '-L/usr/local/Cellar/mecab/0.996/lib' found (skipping)
g++ -bundle -undefined dynamic_lookup -arch x86_64 -g build/temp.macosx-10.9-x86_64-3.8/CaboCha_wrap.o -L/usr/local/Cellar/cabocha/0.69/lib -lcabocha -lcrfpp -lmecab -liconv -lmecab -lstdc++ -o build/lib.macosx-10.9-x86_64-3.8/_CaboCha.cpython-38-darwin.so
ld: warning: dylib (/usr/local/Cellar/cabocha/0.69/lib/libcabocha.dylib) was built for newer macOS version (11.0) than being linked (10.9)
ld: warning: dylib (/usr/local/lib/libcrfpp.dylib) was built for newer macOS version (11.0) than being linked (10.9)
ld: warning: dylib (/usr/local/lib/libmecab.dylib) was built for newer macOS version (11.0) than being linked (10.9)
running install_lib
copying build/lib.macosx-10.9-x86_64-3.8/_CaboCha.cpython-38-darwin.so -> /Users/<username>/<venv-dir>/lib/python3.8/site-packages
copying build/lib.macosx-10.9-x86_64-3.8/CaboCha.py -> /Users/<username>/<venv-dir>/lib/python3.8/site-packages
byte-compiling /Users/<username>/<venv-dir>/lib/python3.8/site-packages/CaboCha.py to CaboCha.cpython-38.pyc
running install_egg_info
Writing /Users/<username>/<venv-dir>/lib/python3.8/site-packages/cabocha_python-0.69-py3.8.egg-info
%

2021/05/29

[Log] macOS Big Sur 11.2.2: tar xfv cabocha-0.69.tar.bz2

Command: tar xfv cabocha-0.69.tar.bz2
Date: 2021/05/29
Environment: macOS Big Sur 11.2.2

% tar xfv cabocha-0.69.tar.bz2
x cabocha-0.69/
x cabocha-0.69/cabocha-config.in
x cabocha-0.69/compile
x cabocha-0.69/swig/
x cabocha-0.69/swig/version.h.in
x cabocha-0.69/swig/Makefile
x cabocha-0.69/swig/version.h
x cabocha-0.69/swig/CaboCha.i
x cabocha-0.69/missing
x cabocha-0.69/java/
x cabocha-0.69/java/test.java
x cabocha-0.69/java/Makefile
x cabocha-0.69/java/org/
x cabocha-0.69/java/org/chasen/
x cabocha-0.69/java/org/chasen/cabocha/
x cabocha-0.69/java/org/chasen/cabocha/FormatType.java
x cabocha-0.69/java/org/chasen/cabocha/OutputLayerType.java
x cabocha-0.69/java/org/chasen/cabocha/Token.java
x cabocha-0.69/java/org/chasen/cabocha/CaboChaConstants.java
x cabocha-0.69/java/org/chasen/cabocha/ParserType.java
x cabocha-0.69/java/org/chasen/cabocha/ParsingAlgorithm.java
x cabocha-0.69/java/org/chasen/cabocha/Chunk.java
x cabocha-0.69/java/org/chasen/cabocha/InputLayerType.java
x cabocha-0.69/java/org/chasen/cabocha/CaboCha.java
x cabocha-0.69/java/org/chasen/cabocha/CaboChaJNI.java
x cabocha-0.69/java/org/chasen/cabocha/PossetType.java
x cabocha-0.69/java/org/chasen/cabocha/Tree.java
x cabocha-0.69/java/org/chasen/cabocha/CharsetType.java
x cabocha-0.69/java/org/chasen/cabocha/Parser.java
x cabocha-0.69/java/CaboCha_wrap.cxx
x cabocha-0.69/ltmain.sh
x cabocha-0.69/config.guess
x cabocha-0.69/man/
x cabocha-0.69/man/Makefile.in
x cabocha-0.69/man/cabocha.1
x cabocha-0.69/man/Makefile.am
x cabocha-0.69/BSD
x cabocha-0.69/python/
x cabocha-0.69/python/test.py
x cabocha-0.69/python/CaboCha.py
x cabocha-0.69/python/CaboCha_wrap.cxx
x cabocha-0.69/python/setup.py
x cabocha-0.69/AUTHORS
x cabocha-0.69/ruby/
x cabocha-0.69/ruby/CaboCha_wrap.cpp
x cabocha-0.69/ruby/extconf.rb
x cabocha-0.69/ruby/test.rb
x cabocha-0.69/Makefile.in
x cabocha-0.69/NEWS
x cabocha-0.69/install-sh
x cabocha-0.69/cabocha.iss.in
x cabocha-0.69/ChangeLog
x cabocha-0.69/configure
x cabocha-0.69/src/
x cabocha-0.69/src/string_buffer.cpp
x cabocha-0.69/src/tree_allocator.cpp
x cabocha-0.69/src/dep.h
x cabocha-0.69/src/dep_learner.cpp
x cabocha-0.69/src/tree_allocator.h
x cabocha-0.69/src/svm.h
x cabocha-0.69/src/svm.cpp
x cabocha-0.69/src/ucstable.h
x cabocha-0.69/src/utils.h
x cabocha-0.69/src/selector.cpp
x cabocha-0.69/src/chunk_learner.cpp
x cabocha-0.69/src/string_buffer.h
x cabocha-0.69/src/ucs.cpp
x cabocha-0.69/src/ne.cpp
x cabocha-0.69/src/eval.cpp
x cabocha-0.69/src/cabocha.cpp
x cabocha-0.69/src/Makefile.in
x cabocha-0.69/src/scoped_ptr.h
x cabocha-0.69/src/chunker.h
x cabocha-0.69/src/normalizer.rule
x cabocha-0.69/src/common.h
x cabocha-0.69/src/normalizer_rule.sh
x cabocha-0.69/src/darts.h
x cabocha-0.69/src/learner.cpp
x cabocha-0.69/src/cabocha.h
x cabocha-0.69/src/morph.h
x cabocha-0.69/src/svm_learn.cpp
x cabocha-0.69/src/Makefile.msvc.in
x cabocha-0.69/src/timer.h
x cabocha-0.69/src/chunker.cpp
x cabocha-0.69/src/utils.cpp
x cabocha-0.69/src/param.h
x cabocha-0.69/src/winmain.h
x cabocha-0.69/src/normalizer.h
x cabocha-0.69/src/param.cpp
x cabocha-0.69/src/parser.cpp
x cabocha-0.69/src/ne.h
x cabocha-0.69/src/normalizer_rule.h
x cabocha-0.69/src/svm_learn.h
x cabocha-0.69/src/ucs.h
x cabocha-0.69/src/cabocha-model-index.cpp
x cabocha-0.69/src/mmap.h
x cabocha-0.69/src/analyzer.h
x cabocha-0.69/src/make.bat
x cabocha-0.69/src/tree.cpp
x cabocha-0.69/src/char_category.h
x cabocha-0.69/src/Makefile.am
x cabocha-0.69/src/dep.cpp
x cabocha-0.69/src/morph.cpp
x cabocha-0.69/src/selector_pat.h
x cabocha-0.69/src/cabocha-system-eval.cpp
x cabocha-0.69/src/cabocha-learn.cpp
x cabocha-0.69/src/stream_wrapper.h
x cabocha-0.69/src/selector.h
x cabocha-0.69/src/libcabocha.cpp
x cabocha-0.69/src/normalizer.cpp
x cabocha-0.69/src/freelist.h
x cabocha-0.69/perl/
x cabocha-0.69/perl/test.pl
x cabocha-0.69/perl/Makefile.PL
x cabocha-0.69/perl/CaboCha_wrap.o
x cabocha-0.69/perl/CaboCha.bs
x cabocha-0.69/perl/blib/
x cabocha-0.69/perl/blib/bin/
x cabocha-0.69/perl/blib/bin/.exists
x cabocha-0.69/perl/blib/arch/
x cabocha-0.69/perl/blib/arch/.exists
x cabocha-0.69/perl/blib/arch/auto/
x cabocha-0.69/perl/blib/arch/auto/CaboCha/
x cabocha-0.69/perl/blib/arch/auto/CaboCha/.exists
x cabocha-0.69/perl/blib/arch/auto/CaboCha/CaboCha.so
x cabocha-0.69/perl/blib/arch/auto/CaboCha/CaboCha.bs
x cabocha-0.69/perl/blib/lib/
x cabocha-0.69/perl/blib/lib/.exists
x cabocha-0.69/perl/blib/lib/auto/
x cabocha-0.69/perl/blib/lib/auto/CaboCha/
x cabocha-0.69/perl/blib/lib/auto/CaboCha/.exists
x cabocha-0.69/perl/blib/lib/CaboCha.pm
x cabocha-0.69/perl/blib/man1/
x cabocha-0.69/perl/blib/man1/.exists
x cabocha-0.69/perl/blib/script/
x cabocha-0.69/perl/blib/script/.exists
x cabocha-0.69/perl/blib/man3/
x cabocha-0.69/perl/blib/man3/.exists
x cabocha-0.69/perl/CaboCha_wrap.cxx
x cabocha-0.69/perl/pm_to_blib
x cabocha-0.69/perl/CaboCha.pm
x cabocha-0.69/perl/MYMETA.yml
x cabocha-0.69/config.rpath
x cabocha-0.69/TODO
x cabocha-0.69/configure.in
x cabocha-0.69/config.sub
x cabocha-0.69/LGPL
x cabocha-0.69/tools/
x cabocha-0.69/tools/kc2cabocha.pl
x cabocha-0.69/tools/irex2cabocha.pl
x cabocha-0.69/tools/chasen2mecab.pl
x cabocha-0.69/tools/kc2juman.pl
x cabocha-0.69/tools/KyotoCorpus.pm
x cabocha-0.69/tools/KNBC2KC.pl
x cabocha-0.69/cabocharc.in
x cabocha-0.69/INSTALL
x cabocha-0.69/aclocal.m4
x cabocha-0.69/README
x cabocha-0.69/config.h.in
x cabocha-0.69/COPYING
x cabocha-0.69/example/
x cabocha-0.69/example/example2.cpp
x cabocha-0.69/example/example.c
x cabocha-0.69/Makefile.am
x cabocha-0.69/model/
x cabocha-0.69/model/dep.ipa.txt
x cabocha-0.69/model/ne.juman.txt
x cabocha-0.69/model/dep.juman.txt
x cabocha-0.69/model/Makefile.in
x cabocha-0.69/model/dep.unidic.txt
x cabocha-0.69/model/chunk.ipa.txt
x cabocha-0.69/model/chunk.unidic.txt
x cabocha-0.69/model/ne.ipa.txt
x cabocha-0.69/model/ne.unidic.txt
x cabocha-0.69/model/chunk.juman.txt
x cabocha-0.69/model/Makefile.am
x cabocha-0.69/doc/
x cabocha-0.69/doc/README.txt
x cabocha-0.69/doc/doxygen/
x cabocha-0.69/doc/doxygen/classes.html
x cabocha-0.69/doc/doxygen/ftv2plastnode.png
x cabocha-0.69/doc/doxygen/nav_g.png
x cabocha-0.69/doc/doxygen/files.html
x cabocha-0.69/doc/doxygen/tab_b.gif
x cabocha-0.69/doc/doxygen/nav_h.png
x cabocha-0.69/doc/doxygen/namespaceCaboCha.html
x cabocha-0.69/doc/doxygen/functions_vars.html
x cabocha-0.69/doc/doxygen/tab_s.png
x cabocha-0.69/doc/doxygen/namespacemembers_eval.html
x cabocha-0.69/doc/doxygen/ftv2pnode.png
x cabocha-0.69/doc/doxygen/cabocha_8h.html
x cabocha-0.69/doc/doxygen/open.png
x cabocha-0.69/doc/doxygen/globals_func.html
x cabocha-0.69/doc/doxygen/structcabocha__token__t.html
x cabocha-0.69/doc/doxygen/doxygen.css
x cabocha-0.69/doc/doxygen/ftv2node.png
x cabocha-0.69/doc/doxygen/functions_func.html
x cabocha-0.69/doc/doxygen/ftv2mnode.png
x cabocha-0.69/doc/doxygen/ftv2doc.png
x cabocha-0.69/doc/doxygen/globals_enum.html
x cabocha-0.69/doc/doxygen/classCaboCha_1_1Tree.html
x cabocha-0.69/doc/doxygen/functions.html
x cabocha-0.69/doc/doxygen/ftv2folderopen.png
x cabocha-0.69/doc/doxygen/namespacemembers.html
x cabocha-0.69/doc/doxygen/globals.html
x cabocha-0.69/doc/doxygen/ftv2link.png
x cabocha-0.69/doc/doxygen/ftv2folderclosed.png
x cabocha-0.69/doc/doxygen/structcabocha__token__t-members.html
x cabocha-0.69/doc/doxygen/bdwn.png
x cabocha-0.69/doc/doxygen/namespacemembers_func.html
x cabocha-0.69/doc/doxygen/structcabocha__chunk__t.html
x cabocha-0.69/doc/doxygen/bc_s.png
x cabocha-0.69/doc/doxygen/cabocha_8h_source.html
x cabocha-0.69/doc/doxygen/globals_eval.html
x cabocha-0.69/doc/doxygen/ftv2mo.png
x cabocha-0.69/doc/doxygen/doxygen.png
x cabocha-0.69/doc/doxygen/index.html
x cabocha-0.69/doc/doxygen/tab_b.png
x cabocha-0.69/doc/doxygen/closed.png
x cabocha-0.69/doc/doxygen/nav_f.png
x cabocha-0.69/doc/doxygen/ftv2lastnode.png
x cabocha-0.69/doc/doxygen/classCaboCha_1_1Tree-members.html
x cabocha-0.69/doc/doxygen/tabs.css
x cabocha-0.69/doc/doxygen/ftv2vertline.png
x cabocha-0.69/doc/doxygen/ftv2cl.png
x cabocha-0.69/doc/doxygen/tab_h.png
x cabocha-0.69/doc/doxygen/globals_type.html
x cabocha-0.69/doc/doxygen/structcabocha__chunk__t-members.html
x cabocha-0.69/doc/doxygen/globals_defs.html
x cabocha-0.69/doc/doxygen/annotated.html
x cabocha-0.69/doc/doxygen/namespacemembers_type.html
x cabocha-0.69/doc/doxygen/tab_l.gif
x cabocha-0.69/doc/doxygen/tab_a.png
x cabocha-0.69/doc/doxygen/sync_off.png
x cabocha-0.69/doc/doxygen/ftv2ns.png
x cabocha-0.69/doc/doxygen/tab_r.gif
x cabocha-0.69/doc/doxygen/classCaboCha_1_1Parser-members.html
x cabocha-0.69/doc/doxygen/ftv2splitbar.png
x cabocha-0.69/doc/doxygen/ftv2mlastnode.png
x cabocha-0.69/doc/doxygen/classCaboCha_1_1Parser.html
x cabocha-0.69/doc/doxygen/namespaces.html
x cabocha-0.69/doc/doxygen/sync_on.png
x cabocha-0.69/doc/doxygen/namespacemembers_enum.html
x cabocha-0.69/doc/doxygen/dir_68267d1309a1af8e8297ef4c3efbcdba.html
x cabocha-0.69/doc/doxygen/dynsections.js
x cabocha-0.69/doc/doxygen/ftv2blank.png
x cabocha-0.69/doc/cabocha.cfg
%

For reference, the contents of the extracted folder:

% cd cabocha-0.69
cabocha-0.69 % tree
.
├── AUTHORS
├── BSD
├── COPYING
├── ChangeLog
├── INSTALL
├── LGPL
├── Makefile.am
├── Makefile.in
├── NEWS
├── README
├── TODO
├── aclocal.m4
├── cabocha-config.in
├── cabocha.iss.in
├── cabocharc.in
├── compile
├── config.guess
├── config.h.in
├── config.rpath
├── config.sub
├── configure
├── configure.in
├── doc
│   ├── README.txt
│   ├── cabocha.cfg
│   └── doxygen
│       ├── annotated.html
│       ├── bc_s.png
│       ├── bdwn.png
│       ├── cabocha_8h.html
│       ├── cabocha_8h_source.html
│       ├── classCaboCha_1_1Parser-members.html
│       ├── classCaboCha_1_1Parser.html
│       ├── classCaboCha_1_1Tree-members.html
│       ├── classCaboCha_1_1Tree.html
│       ├── classes.html
│       ├── closed.png
│       ├── dir_68267d1309a1af8e8297ef4c3efbcdba.html
│       ├── doxygen.css
│       ├── doxygen.png
│       ├── dynsections.js
│       ├── files.html
│       ├── ftv2blank.png
│       ├── ftv2cl.png
│       ├── ftv2doc.png
│       ├── ftv2folderclosed.png
│       ├── ftv2folderopen.png
│       ├── ftv2lastnode.png
│       ├── ftv2link.png
│       ├── ftv2mlastnode.png
│       ├── ftv2mnode.png
│       ├── ftv2mo.png
│       ├── ftv2node.png
│       ├── ftv2ns.png
│       ├── ftv2plastnode.png
│       ├── ftv2pnode.png
│       ├── ftv2splitbar.png
│       ├── ftv2vertline.png
│       ├── functions.html
│       ├── functions_func.html
│       ├── functions_vars.html
│       ├── globals.html
│       ├── globals_defs.html
│       ├── globals_enum.html
│       ├── globals_eval.html
│       ├── globals_func.html
│       ├── globals_type.html
│       ├── index.html
│       ├── namespaceCaboCha.html
│       ├── namespacemembers.html
│       ├── namespacemembers_enum.html
│       ├── namespacemembers_eval.html
│       ├── namespacemembers_func.html
│       ├── namespacemembers_type.html
│       ├── namespaces.html
│       ├── nav_f.png
│       ├── nav_g.png
│       ├── nav_h.png
│       ├── open.png
│       ├── structcabocha__chunk__t-members.html
│       ├── structcabocha__chunk__t.html
│       ├── structcabocha__token__t-members.html
│       ├── structcabocha__token__t.html
│       ├── sync_off.png
│       ├── sync_on.png
│       ├── tab_a.png
│       ├── tab_b.gif
│       ├── tab_b.png
│       ├── tab_h.png
│       ├── tab_l.gif
│       ├── tab_r.gif
│       ├── tab_s.png
│       └── tabs.css
├── example
│   ├── example.c
│   └── example2.cpp
├── install-sh
├── java
│   ├── CaboCha_wrap.cxx
│   ├── Makefile
│   ├── org
│   │   └── chasen
│   │       └── cabocha
│   │           ├── CaboCha.java
│   │           ├── CaboChaConstants.java
│   │           ├── CaboChaJNI.java
│   │           ├── CharsetType.java
│   │           ├── Chunk.java
│   │           ├── FormatType.java
│   │           ├── InputLayerType.java
│   │           ├── OutputLayerType.java
│   │           ├── Parser.java
│   │           ├── ParserType.java
│   │           ├── ParsingAlgorithm.java
│   │           ├── PossetType.java
│   │           ├── Token.java
│   │           └── Tree.java
│   └── test.java
├── ltmain.sh
├── man
│   ├── Makefile.am
│   ├── Makefile.in
│   └── cabocha.1
├── missing
├── model
│   ├── Makefile.am
│   ├── Makefile.in
│   ├── chunk.ipa.txt
│   ├── chunk.juman.txt
│   ├── chunk.unidic.txt
│   ├── dep.ipa.txt
│   ├── dep.juman.txt
│   ├── dep.unidic.txt
│   ├── ne.ipa.txt
│   ├── ne.juman.txt
│   └── ne.unidic.txt
├── perl
│   ├── CaboCha.bs
│   ├── CaboCha.pm
│   ├── CaboCha_wrap.cxx
│   ├── CaboCha_wrap.o
│   ├── MYMETA.yml
│   ├── Makefile.PL
│   ├── blib
│   │   ├── arch
│   │   │   └── auto
│   │   │       └── CaboCha
│   │   │           ├── CaboCha.bs
│   │   │           └── CaboCha.so
│   │   ├── bin
│   │   ├── lib
│   │   │   ├── CaboCha.pm
│   │   │   └── auto
│   │   │       └── CaboCha
│   │   ├── man1
│   │   ├── man3
│   │   └── script
│   ├── pm_to_blib
│   └── test.pl
├── python
│   ├── CaboCha.py
│   ├── CaboCha_wrap.cxx
│   ├── setup.py
│   └── test.py
├── ruby
│   ├── CaboCha_wrap.cpp
│   ├── extconf.rb
│   └── test.rb
├── src
│   ├── Makefile.am
│   ├── Makefile.in
│   ├── Makefile.msvc.in
│   ├── analyzer.h
│   ├── cabocha-learn.cpp
│   ├── cabocha-model-index.cpp
│   ├── cabocha-system-eval.cpp
│   ├── cabocha.cpp
│   ├── cabocha.h
│   ├── char_category.h
│   ├── chunk_learner.cpp
│   ├── chunker.cpp
│   ├── chunker.h
│   ├── common.h
│   ├── darts.h
│   ├── dep.cpp
│   ├── dep.h
│   ├── dep_learner.cpp
│   ├── eval.cpp
│   ├── freelist.h
│   ├── learner.cpp
│   ├── libcabocha.cpp
│   ├── make.bat
│   ├── mmap.h
│   ├── morph.cpp
│   ├── morph.h
│   ├── ne.cpp
│   ├── ne.h
│   ├── normalizer.cpp
│   ├── normalizer.h
│   ├── normalizer.rule
│   ├── normalizer_rule.h
│   ├── normalizer_rule.sh
│   ├── param.cpp
│   ├── param.h
│   ├── parser.cpp
│   ├── scoped_ptr.h
│   ├── selector.cpp
│   ├── selector.h
│   ├── selector_pat.h
│   ├── stream_wrapper.h
│   ├── string_buffer.cpp
│   ├── string_buffer.h
│   ├── svm.cpp
│   ├── svm.h
│   ├── svm_learn.cpp
│   ├── svm_learn.h
│   ├── timer.h
│   ├── tree.cpp
│   ├── tree_allocator.cpp
│   ├── tree_allocator.h
│   ├── ucs.cpp
│   ├── ucs.h
│   ├── ucstable.h
│   ├── utils.cpp
│   ├── utils.h
│   └── winmain.h
├── swig
│   ├── CaboCha.i
│   ├── Makefile
│   ├── version.h
│   └── version.h.in
└── tools
    ├── KNBC2KC.pl
    ├── KyotoCorpus.pm
    ├── chasen2mecab.pl
    ├── irex2cabocha.pl
    ├── kc2cabocha.pl
    └── kc2juman.pl

26 directories, 212 files
cabocha-0.69 %

2021/05/29

[Log] macOS Big Sur 11.2.2: brew install cabocha

Command: brew install cabocha
Date run: 2021/05/29
Environment: macOS Big Sur 11.2.2

% brew install cabocha
Updating Homebrew...
==> Auto-updated Homebrew!
Updated 1 tap (homebrew/core).
==> Updated Formulae
Updated 1 formula.

==> Downloading https://ghcr.io/v2/homebrew/core/cabocha/manifests/0.69-1
######################################################################## 100.0%
==> Downloading https://ghcr.io/v2/homebrew/core/cabocha/blobs/sha256:1dd5c1474946aaab675326323c8f7e3d101687b50d5542464558f54a8c477cc8
==> Downloading from https://pkg-containers.githubusercontent.com/ghcr1/blobs/sha256:1dd5c1474946aaab675326323c8f7e3d101687b50d5542464558f54a8c477cc8?se=2021-05-28T21%3A35%3A00Z&sig=LU2t3QBPVMTA
######################################################################## 100.0%
==> Pouring cabocha--0.69.big_sur.bottle.1.tar.gz
🍺  /usr/local/Cellar/cabocha/0.69: 28 files, 236.2MB
%

2021/05/29

[Log] macOS Big Sur 11.2.2: brew install crf++

Command: brew install crf++
Date run: 2021/05/29
Environment: macOS Big Sur 11.2.2

% brew install crf++
Updating Homebrew...
==> Auto-updated Homebrew!
Updated 2 taps (homebrew/core and homebrew/cask).
==> New Formulae
caire                                  cidr2range                             qthreads                               range2cidr                             universal-ctags
==> Updated Formulae
Updated 344 formulae.
==> Renamed Formulae
badtouch -> authoscope
==> New Casks
assinador-serpro                                 dmidiplayer                                      futurerestore-gui                                hightop
==> Updated Casks
Updated 197 casks.

==> Downloading https://ghcr.io/v2/homebrew/core/crfxx/manifests/0.58-3
######################################################################## 100.0%
==> Downloading https://ghcr.io/v2/homebrew/core/crfxx/blobs/sha256:fcf0862271c392bc7b69a4e02a74dd9bd85615b6be0273009e7611bb78298f61
==> Downloading from https://pkg-containers.githubusercontent.com/ghcr1/blobs/sha256:fcf0862271c392bc7b69a4e02a74dd9bd85615b6be0273009e7611bb78298f61?se=2021-05-28T21%3A25%3A00Z&sig=fBS3Lw84FQ6O
######################################################################## 100.0%
==> Pouring crf++--0.58.big_sur.bottle.3.tar.gz
🍺  /usr/local/Cellar/crf++/0.58: 13 files, 765.2KB
%