This project analyses KEML files statistically. For each KEML file it produces:
Since this project uses EMF components, it is best viewed and adapted in Eclipse. If you load it there, make sure that it has the right project natures, namely Modeling and Maven. If you have freshly added the Maven nature to this project in Eclipse, you may need to run Maven -> Update Project on it before using Maven to install the necessary libraries.
This project is a basic Maven-based Java application that you can run in any of the usual ways (command line, IDE, ...). It has one optional input: the base folder. If none is given, it creates statistics on the introductory example from keml.sample, assuming that keml.sample is located on the same level as this project. All output files are stored in the folder analysis.
In analysis, each filename starts with a prefix pre that is equal to the KEML file name.
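The naming scheme can be illustrated with a minimal sketch. It is not the project's actual code: the class `OutputNames` and its methods are hypothetical, and it assumes that the prefix is the KEML file name without its extension. The suffix `-arguments.csv` is the one used for the argumentation statistics.

```java
import java.nio.file.Path;

// Hypothetical helper, not part of the project: derives output file names.
final class OutputNames {
    private OutputNames() {}

    /** Returns the file name without its last extension (assumption: the prefix drops the extension). */
    static String prefixOf(Path kemlFile) {
        String name = kemlFile.getFileName().toString();
        int dot = name.lastIndexOf('.');
        return dot > 0 ? name.substring(0, dot) : name;
    }

    /** Builds the path of an output file inside the given analysis folder. */
    static Path outputFile(Path analysisFolder, Path kemlFile, String suffix) {
        return analysisFolder.resolve(prefixOf(kemlFile) + suffix);
    }

    public static void main(String[] args) {
        Path keml = Path.of("conversations", "example.keml"); // made-up input path
        Path out = outputFile(Path.of("analysis"), keml, "-arguments.csv");
        System.out.println(out.getFileName()); // prints example-arguments.csv
    }
}
```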
Currently, three types of statistics are generated:
General statistics are stored under
This CSV file holds a Message Part and a Knowledge Part where it gives statistics per Conversation Partner. The Message Part gives counts for sends and receives, as well as interruptions. The Knowledge Part counts PreKnowledge and New information, split into Facts and Instructions. It also counts repetitions.
Argumentation statistics are stored under pre-arguments.csv.
This CSV file consists of a table that counts attacks and supports between facts (F) and instructions (I) of all conversation partners (including the human author).
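As a rough illustration of how such a table could be filled, the sketch below counts links by source type, target type and relation. The types `Kind`, `Relation` and `Link` are hypothetical stand-ins and do not reflect the KEML model classes or the exact layout of the CSV file.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: counts attacks and supports between facts (F) and instructions (I).
final class ArgumentCounts {
    enum Kind { F, I }                 // fact or instruction
    enum Relation { ATTACK, SUPPORT }

    /** Assumed representation of one argumentative link between two pieces of information. */
    static final class Link {
        final Kind source;
        final Kind target;
        final Relation relation;

        Link(Kind source, Kind target, Relation relation) {
            this.source = source;
            this.target = target;
            this.relation = relation;
        }
    }

    private ArgumentCounts() {}

    /** Returns one counter per combination that occurs, keyed like "F->I ATTACK". */
    static Map<String, Integer> count(List<Link> links) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (Link link : links) {
            String key = link.source + "->" + link.target + " " + link.relation;
            counts.merge(key, 1, Integer::sum);
        }
        return counts;
    }
}
```

Each key of the resulting map would correspond to one cell of the table.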
Trust Scores are given as Excel (xlsx) files pre-w n--arguments.csv where n is the weight of the trust computation formula. Each file depicts four scenarios (a-d) described under Initial Trust. Each scenario consists of two columns, one (iT) that lists the initial trust score for each information and one (T) that lists the (final) trust score. Additionally, there are columns to describe the information i precisely:
- The time stamp (-1 for pre knowledge) with the background color stating whether i is fact (green) or instruction (orange)
- The message column with the background color blue for LLM messages and yellow for all other messages
- The argument count #Arg counting how many other pieces of information influence i directly
- The repetition count #Rep counting the number of repetitions of i
Trust T into an information i is computed based on initial trust
Here, restrict limits the computed trust to a value in [-1.0, 1.0].
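A minimal sketch of such a clamping function is given below. The class name `TrustRestriction` is an assumption made for illustration and the project's own implementation may differ.

```java
// Hypothetical sketch of the restrict step: clamps a value to [-1.0, 1.0].
final class TrustRestriction {
    private TrustRestriction() {}

    /** Returns the value unchanged if it lies in [-1.0, 1.0], otherwise the nearest bound. */
    static double restrict(double value) {
        return Math.max(-1.0, Math.min(1.0, value));
    }

    public static void main(String[] args) {
        System.out.println(restrict(1.7));   // above the range, clamped to the upper bound
        System.out.println(restrict(-0.25)); // inside the range, returned unchanged
        System.out.println(restrict(-3.0));  // below the range, clamped to the lower bound
    }
}
```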
The weight
The phenomenon that people trust a piece of information more the more often they have heard it is known as the (illusory) truth effect.
We compute it as the proportion of repetitions of the information
The repetition score can only contribute positively to our trust and we have
The argumentative trust
Here,
The initial trust into an information i could be assigned individually to each information. In this analysis module, it is currently evaluated in four scenarios that distinguish between the LLM ($LLM$) and all other conversation partners ($P$):
- a) trust all completely ($T_{init}(P) = 1$; $T_{init}(LLM) = 1$)
- b) trust the LLM less ($T_{init}(P) = 1$; $T_{init}(LLM) = 0.5$)
- c) trust the LLM more than others ($T_{init}(P) = 0.5$; $T_{init}(LLM) = 1$)
- d) limit trust into all ($T_{init}(P) = 0.5$; $T_{init}(LLM) = 0.5$)
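
The four scenarios can be written down as a small lookup, sketched here as a Java enum. The enum `InitialTrustScenario` and its method are hypothetical names chosen for illustration. Only the numeric values are taken from the list above.

```java
// Hypothetical sketch: the four initial trust scenarios a) to d) as constants.
enum InitialTrustScenario {
    A(1.0, 1.0),  // trust all completely
    B(1.0, 0.5),  // trust the LLM less
    C(0.5, 1.0),  // trust the LLM more than others
    D(0.5, 0.5);  // limit trust into all

    private final double partnerTrust; // T_init(P)
    private final double llmTrust;     // T_init(LLM)

    InitialTrustScenario(double partnerTrust, double llmTrust) {
        this.partnerTrust = partnerTrust;
        this.llmTrust = llmTrust;
    }

    /** Initial trust for a piece of information, depending on whether it came from the LLM. */
    double initialTrust(boolean fromLlm) {
        return fromLlm ? llmTrust : partnerTrust;
    }

    public static void main(String[] args) {
        for (InitialTrustScenario s : values()) {
            System.out.println(s + ": T_init(P)=" + s.initialTrust(false)
                    + ", T_init(LLM)=" + s.initialTrust(true));
        }
    }
}
```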
We write
The license of this project is that of the group.