20231228

2023/12/28 21:49

. . .

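Huff huff huff huff huff huff
I will regret buying the iPad version of Clip Studio Paint for the rest of my life. It is far too hard to use. Why didn't I just get a pen display like a sensible person? "Regret" undersells it; this is the kind of regret that tears your body apart.
4 pomodoros gets me about 4 panels (not counting coloring), so once color and dialogue are included it will probably be about 2 pomodoros per panel. 50 minutes per panel? You've got to be kidding...? It's an 18-page manga, so... how many panels is that? Even at an average of about 4 per page... ...This is hell...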
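
The estimate above can be worked through as plain arithmetic. This is only a rough sketch: the 25-minute pomodoro is an assumption (it matches "2 pomodoros = 50 minutes"), and 4 panels per page is the guess from the text, not a counted figure.

```python
# Rough workload estimate for the 18-page manga.
POMODORO_MINUTES = 25      # assumption: standard pomodoro length
pages = 18
panels_per_page = 4        # the rough average guessed in the text
pomodoros_per_panel = 2    # including color and dialogue

total_panels = pages * panels_per_page
total_minutes = total_panels * pomodoros_per_panel * POMODORO_MINUTES
total_hours = total_minutes / 60

print(total_panels, "panels,", total_minutes, "minutes,", total_hours, "hours")
```

Under those assumptions it comes to 72 panels and about 60 hours of drawing.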

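The comment "I'm past saving but can't die right away, and I want to ask マリー, who can't stop herself from saying 'The 救護騎士団 will be here any moment!' even though she knows the wound is fatal, to deliver the finishing blow" is genius. How does anyone come up with a situation that heartless? Wonderful. Somebody please write it.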

Me irl

https://www.youtube.com/watch?v=khplMpm4ctc

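Seriously good. Around 1:10. The 「な」 comes out husky, you see. It's husky, but the note is there. It's there, and it has vibrato on it.
I also like the triplet-like way 「見つけないで」 is sung right after that.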

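About that leaderboard: the ERP results had also been published on a separate page.
It looks like machine evaluation, and it seems to take things like repetition into account barely or not at all, so it still has problems. alphav2 is the strongest! Same as ever!!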

https://nazology.net/archives/142135/3

Overlord is here, huh...
If we can steal energy from another branch, the resource problem is solved. Or wait, are we the ones who get robbed this time?

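I'm hooked on building a multimodal search system based on vectorization. Running it with Langchain (together with llama_index, as in the code at the end), with intfloat/multilingual-e5-large as the embedding model and something like Loyal-Macaroni-Maid-7B as the LLM, gives good accuracy.
Honestly I don't really understand it, but does a multilingual text embedding model let an existing LLM become accurate regardless of language? Or is Loyal-Macaroni-Maid-7B just that good? I dunno. I also don't really understand what happens to the context length once everything is vectorized.
This looks insanely useful for managing a fictional world. After all, you just ask a question normally and it answers from your Obsidian database.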
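
On the context-length question, a rough sketch of what the retrieval step does, as far as I understand setups like this: the notes are split into chunks, each chunk is embedded once, and at question time only the few chunks whose vectors are closest to the question's vector are pasted into the LLM prompt. The vectors and note names below are made-up toy values, not real embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, chunk_vecs, k):
    """Return the names of the k chunks most similar to the query."""
    scored = [(cosine(query_vec, vec), name) for name, vec in chunk_vecs.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Hypothetical 3-dimensional "embeddings" for four note chunks.
chunks = {
    "note_a": [0.9, 0.1, 0.0],
    "note_b": [0.1, 0.9, 0.0],
    "note_c": [0.7, 0.3, 0.1],
    "note_d": [0.0, 0.1, 0.9],
}
query = [1.0, 0.2, 0.0]

selected = top_k(query, chunks, k=2)
print(selected)  # only these chunks would be sent to the LLM
```

So the LLM never sees the whole folder, only the selected chunks plus the question, which is why the total size of the notes is not limited by the model's context window.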

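Enter a question: What is 回転遁地術?
Answer: Unlike ordinary 遁地術, 回転遁地術 is a 縮地 with rotation applied to the "entry" and the "stop". 木乃葉 devised the technique independently, taking a hint from the motion of a spinning top. Its weaknesses are that the "entry" is easy to read because the start includes a rotating motion, and that it stops later than ordinary 遁地 because inertial movement remains. On the other hand, a change of posture can be made at the same time as the "stop", so the transition to an attack is quick.

Enter a question: Who is the strongest female ninja?
Answer: The strongest female ninja is 雪乃.

Enter a question: What are 秘石?
Answer: 秘石 are rare substances of the 椎葉流 that hold specific abilities. For example, 黄水晶 has a useless ability, so few people actually use it. The 秘石 of 鹿野遊里 include 紅玉, 琥珀, 紫水晶, 瑪瑙, 金剛, and 翡翠, and 秘石 have attributes. 秘石 are used as a means of harnessing 神力, and ninja and yokai can use them to perform 神術.

Enter a question: Where is 鹿野遊里?
Answer: It lies deep in 扇山 in 椎葉村. It is the head temple of the 椎葉流 ninja and is also called the ninja village.

Enter a question: Who is 琥珀?
Answer: 琥珀 is an artificial yokai who became a human sacrifice at the time of the 妖怪事変 and fought as a protector of humanity. At the time 琥珀 was called 「両面宿儺」, a name said to have been taken from the strongest yokai. It is also said that control of 剣魂 is carried out through 琥珀.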

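All correct. Too amazing. At this point it's basically a reader of the story in my head.
If I'm being greedy, it would be unbeatable if it also pulled up the source, meaning which note each piece of information came from. The possibility of hallucination is always hanging around, so I want it to show sources.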
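
A minimal sketch of how sources might be listed. As far as I know, the response object returned by llama_index's `query_engine.query` carries a `source_nodes` list (the retrieved chunks with similarity scores), and `SimpleDirectoryReader` normally records a `file_name` in each chunk's metadata. The helper name `format_sources` and the stand-in objects below are hypothetical, used only so the example runs without loading any model.

```python
from types import SimpleNamespace

def format_sources(source_nodes):
    """Build a short 'which note did this come from' report.

    Expects objects shaped like llama_index's NodeWithScore:
    `.node.metadata` (a dict) and `.score` (float or None).
    """
    lines = []
    for item in source_nodes:
        name = item.node.metadata.get("file_name", "(unknown note)")
        score = "n/a" if item.score is None else f"{item.score:.3f}"
        lines.append(f"- {name} (similarity {score})")
    return "\n".join(lines)

# Stand-ins for real retrieved nodes (hypothetical file names and scores).
fake_nodes = [
    SimpleNamespace(node=SimpleNamespace(metadata={"file_name": "note_a.md"}), score=0.8712),
    SimpleNamespace(node=SimpleNamespace(metadata={}), score=None),
]

report = format_sources(fake_nodes)
print(report)
```

With a real query engine, something like `print(format_sources(response_stream.source_nodes))` after the answer has been printed would be the natural place to call it, assuming the streaming response exposes `source_nodes` in the installed version.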
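Also, once in a while the reply comes back in English like this, even when the question is in Japanese. The base model is mainly English, so I suppose that's only natural.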

Enter a question: What is 妖魔?
Answer: 妖魔 is a tag used for supernatural or malevolent beings in the Idician’s 木乃葉編 (Yokai Encyclopedia). It includes various characters such as those from the category of 鬼神 (demon gods) like 大嶽丸, 酒呑童子, 夜叉, 那伽, and 鳩槃荼. The term also refers to the character 幽鬼 (Yaksya), who is a part of the 夜叉 (Yaksha) lineage and has a dual nature of being both a demon and a benefactor to humans. Forest-dwelling spirits, Yaksyas are known for their reverence towards water and are sometimes called yaksya. Additionally, in the 8部1章, the character ユウキ is associated with a Yaksya, indicating a possible connection or shared existence.

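The content is correct. If it works this well, the context doesn't seem to be bound to 2048. The total character count in the folder is pretty large, after all. What an era we live in, really.
It's a long series, so even I am on the verge of forgetting my own settings, but if I ask this thing I get correct information 99 times out of 100, so as long as I jot down notes for now, everything is fine. A world where humans lost to AI? (板橋ハウス)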
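
On the 2048 point, a back-of-the-envelope budget using the settings from the script at the end (`context_window`, `chunk_size`, `similarity_top_k`, `max_new_tokens`). Treat it as approximate: chunk sizes are counted with the indexer's tokenizer, which may not match the LLM's tokenizer exactly.

```python
# Values copied from the script's settings.
context_window = 2048     # HuggingFaceLLM(context_window=...)
chunk_size = 512          # ServiceContext.from_defaults(chunk_size=...)
similarity_top_k = 3      # as_query_engine(similarity_top_k=...)
max_new_tokens = 256      # HuggingFaceLLM(max_new_tokens=...)

# Tokens taken up by the retrieved chunks in a single prompt.
retrieved_tokens = chunk_size * similarity_top_k

# What is left for the prompt template and the question itself.
room_for_prompt = context_window - retrieved_tokens - max_new_tokens

print(retrieved_tokens, room_for_prompt)
```

So each question still has to fit inside 2048 tokens; it is just that only about three 512-token chunks are sent per question, no matter how large the folder is.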

Experiment

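Enter a question (asked in Japanese): How did ナズナ and マーリク meet ハナエ?
Answer (in Japanese): The details of when ナズナ and マーリク met ハナエ are not clearly shown in the text. However, we can tell that マーリク freed ハナエ and that they were able to secure one ability user. After that, ハナ lunged at マーリク, but his body did not budge at all. ナズナ gave a self-introduction to ハナエ, and it turned out that ナズナ had only heard about ハナエ from マーリク.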

Enter a question: How did Nazuna and Marik meet Hanae?
Answer: Nazuna instructed Marik wirelessly to find out Hanae’s intentions and true identity. They encountered Hanae in a cold place where it was difficult to walk barefoot. Hanae, who called herself by that name, approached Marrik and smiled. They asked Hanae why she was in that place, and Hanae replied that she was just there for some reason, finding it more surprising that Marrik was there. They also discussed the coldness and Marrik’s body, wondering if it was human or something else. Hanae mentioned that she had a neurological disorder, and they continued their conversation.

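Even when the source text is Japanese, asking the question in English seems to give more accurate answers. The latter (the English answer) is the correct one.
It may well be case by case, though.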

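Enter a question (asked in Japanese): What is オトリコ's ability?
Answer (in Japanese): オトリコ's ability is 『強制舞踏』. It activates on every human who perceives any element of her through any of the five senses, such as seeing her figure, hearing the sounds she makes or her voice, or smelling her scent. While she dances to music, anyone caught by the technique can only make exactly the same movements as her.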

Enter a question: What is Otoriko’s ability?
Answer: Otoriko’s ability is called “強制舞踏” (Enforced Dance). It causes her physical appearance, the sounds she makes, her voice, and her scent to be detected by all humans through their five senses. When she dances in rhythm with music, anyone under the effect of her ability can only mimic her movements exactly. However, she is also bound to the dance and can easily fall into a trance-like state. While it is possible to inflict damage on an opponent by forcing them to dance in dangerous locations, it is quite difficult to do so. Additionally, she cannot choose who is affected by her ability, making it challenging to fight alongside her teammates.

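This time the English answer is slightly off, because it says "It causes her physical appearance, the sounds she makes, her voice, and her scent to be detected by all humans through their five senses." In other words, it describes being perceived as the effect of the ability, when perceiving her is actually what triggers it. The Japanese answer is the correct one.
So is Japanese better for simple questions? Or is it six of one, half a dozen of the other? My gut feeling is that, overall, English has the higher batting average. Japanese is plenty impressive too, though.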

The code looks like this.

import torch
from langchain.embeddings import HuggingFaceEmbeddings
from llama_index import GPTVectorStoreIndex, ServiceContext, SimpleDirectoryReader
from llama_index.embeddings import LangchainEmbedding
from llama_index.llms import HuggingFaceLLM
from llama_index.prompts.prompts import SimpleInputPrompt

llm_model_path = "Loyal-Macaroni-Maid-7B"
embed_model_path = "intfloat/multilingual-e5-large"


def read_txt(path):
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# Load the documents (every file under the notes folder)
documents = SimpleDirectoryReader("木乃葉編").load_data()
print("documents :", documents)

# Prepare the embedding model
embed_model = LangchainEmbedding(HuggingFaceEmbeddings(
    model_name=embed_model_path
))

# Prepare the prompt wrapper (Alpaca-style instruction format)
query_wrapper_prompt = SimpleInputPrompt(
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{query_str}\n\n### Response:"
)

# Prepare the HuggingFaceLLM (the local model that generates the answers)
hf_predictor = HuggingFaceLLM(
    context_window=2048,
    max_new_tokens=256,
    generate_kwargs={"temperature": 0.25, "do_sample": True},
    tokenizer_name=llm_model_path,
    model_name=llm_model_path,
    device_map="auto",
    stopping_ids=[0, 1, 2],
    tokenizer_kwargs={"max_length": 2048},
    model_kwargs={"torch_dtype": torch.bfloat16, "trust_remote_code": True},
)
service_context = ServiceContext.from_defaults(
    chunk_size=512,
    llm=hf_predictor,
    embed_model=embed_model
)

# Build the vector index and persist it to ./storage
index = GPTVectorStoreIndex.from_documents(
    documents,
    service_context=service_context
)
index.storage_context.persist(persist_dir="./storage")

# Prepare the query engine (the top 3 most similar chunks are retrieved per question)
query_engine = index.as_query_engine(
    streaming=True,
    similarity_top_k=3
)

# Run inference in a loop (stop with Ctrl+C)
while True:
    q = input("\nEnter a question: ")
    response_stream = query_engine.query(q)
    print("Answer: ")
    response_stream.print_response_stream()
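
The script persists the index to `./storage` but rebuilds it from scratch on every run. A minimal sketch of reusing the saved index, assuming the 0.9-era llama_index API (`StorageContext.from_defaults` and `load_index_from_storage`); the helper name `load_or_build_index` is hypothetical.

```python
import os

def load_or_build_index(persist_dir, build_index, service_context=None):
    """Reload a persisted index if persist_dir exists; otherwise build and persist one.

    `build_index` is a zero-argument callable that returns a freshly built index.
    """
    if os.path.isdir(persist_dir):
        # Imported lazily so the helper can be defined without llama_index loaded.
        from llama_index import StorageContext, load_index_from_storage

        storage_context = StorageContext.from_defaults(persist_dir=persist_dir)
        return load_index_from_storage(storage_context, service_context=service_context)

    index = build_index()
    index.storage_context.persist(persist_dir=persist_dir)
    return index

# Possible usage inside the script above (names taken from that script):
# index = load_or_build_index(
#     "./storage",
#     lambda: GPTVectorStoreIndex.from_documents(documents, service_context=service_context),
#     service_context=service_context,
# )
```

The catch is that notes edited after the index was saved are not picked up until `./storage` is deleted and the index is rebuilt.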
