Examples for RAG Benchmarking - LLM as a judge #96

sreanikdk · 2024-09-27T15:10:26Z

This PR adds examples for 'RAG Benchmarking - LLM as a judge' as a new section.

romaniam · 2024-10-02T14:55:47Z

scripts/step03_explore_examples/06_rag_benchmarking/src/evaluate_with_golden_testset.py

+from rag_benchmark_utils.common_utils import create_retriever, validate_response, print_results
+from helpers.config import TERRAFORM_DOCS_TABLE_NAME
+
+RELATIVE_FILE_PATH = Path("data/golden_test_set.csv")


Not a must, but RELATIVE_FILE_PATH might be easier to configure if it were moved to the config file.
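
To illustrate the suggestion, here is a minimal sketch of how the path could live in the config module. It assumes `helpers/config.py` (already imported in the diff for `TERRAFORM_DOCS_TABLE_NAME`) is a reasonable home for it. The function name `golden_test_set_path` and the environment variable `GOLDEN_TEST_SET_PATH` are hypothetical and not part of this PR.

```python
# Hypothetical addition to helpers/config.py; all names below are illustrative.
import os
from pathlib import Path

# Default taken from the value hard-coded in the diff.
DEFAULT_GOLDEN_TEST_SET = "data/golden_test_set.csv"


def golden_test_set_path(env=os.environ) -> Path:
    """Return the golden test set path, allowing an override via an
    environment variable (the variable name is an assumption)."""
    return Path(env.get("GOLDEN_TEST_SET_PATH", DEFAULT_GOLDEN_TEST_SET))
```

With something like this in place, any script that reads or writes the golden test set could import the same helper instead of defining its own path constant.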

romaniam · 2024-10-02T15:01:03Z

scripts/step03_explore_examples/06_rag_benchmarking/src/evaluate_without_golden_testset.py

+        try:
+            log.info(f"Asking question: {query}")
+            result = qa_chain.invoke({"query": query})
+            actual_answer = result.get("result", "N/A")


Again, not a must, but named constants for the "N/A" and "Failed" values would be better in this scenario.
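
A rough sketch of what this could look like. The diff only shows the `"N/A"` default; the `except` branch is not visible here, so the use of `"Failed"` as the fallback on errors is an assumption. The names `NOT_AVAILABLE`, `FAILED`, and `ask` are hypothetical.

```python
# Illustrative only: constant and function names are not from the PR.
NOT_AVAILABLE = "N/A"   # used when the chain returns no "result" key
FAILED = "Failed"       # assumed placeholder when the chain call raises


def ask(invoke, query: str) -> str:
    """Call a QA chain's invoke-like callable and return its answer,
    falling back to the named placeholders instead of string literals."""
    try:
        result = invoke({"query": query})
        return result.get("result", NOT_AVAILABLE)
    except Exception:
        return FAILED
```

In the script this would be called as `ask(qa_chain.invoke, query)`, so the placeholders are defined once and can be compared against when the results are validated.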

romaniam · 2024-10-02T15:02:47Z

scripts/step03_explore_examples/06_rag_benchmarking/src/generate_golden_testset.py

+    multi_context: 0.4,
+    reasoning: 0.1
+}
+OUTPUT_PATH = Path('data/golden_test_set.csv')


Not a must, but OUTPUT_PATH might be easier to configure if it were moved to the config file.

romaniam

Some small remarks, but in general LGTM.

sreanikdk added 2 commits September 27, 2024 14:10

Adding RAG benchmarking examples

0f2f029

Readme updated

c083204

sreanikdk requested review from SeriousSem, romaniam and EvgeniiSkrebtcov September 27, 2024 15:11

SeriousSem approved these changes Oct 2, 2024

romaniam reviewed Oct 2, 2024

romaniam approved these changes Oct 2, 2024
