Raven's Research Page


0407 Daily Report

Apr 07, 2026 · 1 min read

diary
rag

Today’s reading :

Research inspiration :

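Existing benchmarks generally require manual review, or need humans to take part in writing the questions.
LLM-generated benchmarks generally create their questions in a document-based way.
- So each question is usually generated one-to-one from a single source document.
- There is a chance that other valid but hidden answers exist in other data sources.
- This leads to poor benchmark scores and makes it easy to misjudge a system from its score.
Existing GraphRAG approaches can generally handle multi-hop questions by using a graph, which is a well-structured data structure.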
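The misjudgment described above for document-based question generation can be sketched with a toy scorer. This is only an illustration, not the scoring code of any real benchmark: the question, the answer lists, and the `exact_match` helper are all hypothetical. It compares a score computed against the single answer taken from the source document with a score computed against every valid answer across documents.

```python
# Toy illustration only: the question, answers, and helper names are hypothetical.

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so formatting differences do not matter."""
    return " ".join(text.lower().split())


def exact_match(prediction: str, gold_answers: list[str]) -> int:
    """Return 1 if the prediction matches any accepted answer, else 0."""
    accepted = {normalize(gold) for gold in gold_answers}
    return int(normalize(prediction) in accepted)


# The question was generated from one document, so only that document's answer
# was recorded as gold.
question = "Which city hosted a Summer Olympics in the 2010s?"
single_doc_gold = ["London"]                    # answer from the source document
cross_doc_gold = ["London", "Rio de Janeiro"]   # a second valid answer from another document

# A system that retrieved the other document gives a correct answer.
prediction = "Rio de Janeiro"

score_single = exact_match(prediction, single_doc_gold)  # marked wrong
score_cross = exact_match(prediction, cross_doc_gold)    # marked right
print(score_single, score_cross)
```

Under these assumptions the same correct prediction is scored differently depending only on whether the hidden answer from the other document was recorded, which is the kind of score misjudgment noted above.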

Project Progress :

Finished the Docker image, but still cannot test it on the VM.
Adjusted the package structure
- in branch refactor-api-docker

Side Project :

Built this page on GitHub Pages

TODO :

Read more papers about benchmark evaluation
- HotpotQA
- 2WikiMultiHopQA

