
Item parameter generation

tmatta edited this page Jul 8, 2017 · 4 revisions

The item_gen function

The function item_gen facilitates the generation of item parameters from a range of item response models.

lsasim::item_gen(n_1pl = NULL, n_2pl = NULL, n_3pl = NULL, thresholds = 1,
                 b_bounds, a_bounds = NULL, c_bounds = NULL)  

The arguments n_1pl, n_2pl and n_3pl specify the number of one-, two-, and three-parameter items to be generated. The argument thresholds specifies the number of thresholds for the one- and two-parameter items. Finally, the arguments b_bounds, a_bounds and c_bounds specify the bounds of the uniform distributions used to generate the b, a, and c parameters, respectively.

The item_gen function constrains the set of generated items to have an average b parameter of 0. When few items are generated, this constraint may push some of the generated b parameters outside the limits given in b_bounds. However, for large numbers of items, as long as the bounds of the b parameter are symmetric around 0, the generated values should not exceed the specified bounds.
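The effect of the mean-zero constraint can be illustrated in base R. The sketch below is not the lsasim implementation: it assumes, for illustration only, that the constraint is imposed by subtracting the sample mean from uniform draws, and the helper name center_draws is hypothetical.

```r
# Illustration only: assumes centering is done by subtracting the sample mean,
# which may differ from what lsasim::item_gen does internally.
set.seed(2017)

center_draws <- function(n, bounds) {
  b <- runif(n, min = bounds[1], max = bounds[2])  # uniform draws within bounds
  b - mean(b)                                      # shift so the mean is 0
}

b_bounds <- c(-2, 2)
b_small <- center_draws(5, b_bounds)     # few items: the shift can be large
b_large <- center_draws(5000, b_bounds)  # many items: the shift is close to 0

range(b_small)
range(b_large)
mean(b_large)  # numerically 0
```

Under this assumption the shift equals the sample mean of the draws, which shrinks as the number of items grows. That is why only small item sets are likely to end up outside b_bounds.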

For the following examples, we will be generating I = 30 items.

I <- 30

Dichotomous IRT models

Dichotomous IRT models (i.e., 1PL, 2PL, or 3PL) can be expressed by the equation

$$ p(y_{ij} = 1 | \theta_{j}) = c_{i} + (1 - c_{i}) \frac{\text{exp} \left[ a_{i} (\theta_{j} - b_{i}) \right]} {1 + \text{exp} \left[ a_{i} (\theta_{j} - b_{i}) \right]} $$

where $y_{ij}$ is the response to item $i$ by respondent $j$, $\theta_{j}$ is the so-called latent trait (ability) for respondent $j$, and $b_{i}$, $a_{i}$, and $c_{i}$ are the difficulty, discrimination, and pseudo-guessing parameters, respectively, for item $i$.
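The equation can be written as a small base-R function. This is a sketch for illustration and not part of lsasim; the function name p_correct and the argument name guess (standing in for the c parameter) are hypothetical.

```r
# Response probability under the 3PL model; the defaults reduce it to the 1PL.
# `guess` is the pseudo-guessing parameter c, renamed to avoid masking base::c.
p_correct <- function(theta, b, a = 1, guess = 0) {
  # plogis(x) computes exp(x) / (1 + exp(x))
  guess + (1 - guess) * plogis(a * (theta - b))
}

p_correct(theta = 0, b = 0)               # 1PL at theta = b: 0.5
p_correct(theta = 0, b = 0, a = 1.2)      # 2PL at theta = b: 0.5
p_correct(theta = 0, b = 0, guess = 0.2)  # 3PL at theta = b: 0.2 + 0.8 * 0.5 = 0.6
```

Setting a = 1 and guess = 0 gives the 1PL model, and setting only guess = 0 gives the 2PL model, which mirrors how the a and c columns are fixed for the simpler models.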

For a set of dichotomous items, lsasim::item_gen returns an I × 6 data frame. The first column in the data frame is labeled item and denotes the item ID. Columns two through four, labeled b, a, and c, are the item parameters for the item difficulty, discrimination, and pseudo-guessing, respectively. The fifth column, labeled k, indicates the number of thresholds for each item. The sixth column, labeled p, indicates whether the item is from a 1PL, 2PL, or 3PL model.


1PL Items

To generate item parameters from a 1PL (Rasch) model, we only need to specify n_1pl, the number of one-parameter items, and the bounds of the b parameters for the n_1pl items. Below we generate item parameters for n_1pl = I 1PL items. The bounds of the b parameter are constrained to b_bounds = (-2, 2) where -2 is the lowest generating value and 2 is the highest generating value.

gen1PL <- lsasim::item_gen(n_1pl = I, b_bounds = c(-2, 2))

Printing the first 6 rows of gen1PL, we can see that the b parameters have been generated while the a parameters were fixed to 1 and the c parameters were fixed to 0. Column k indicates the items are dichotomous (1 threshold) and column p indicates the items are from a 1PL generating model.

head(gen1PL)
##   item     b a c k p
## 1    1 -0.60 1 0 1 1
## 2    2  0.78 1 0 1 1
## 3    3 -0.75 1 0 1 1
## 4    4  0.18 1 0 1 1
## 5    5  1.19 1 0 1 1
## 6    6  0.16 1 0 1 1

The 30 generated b parameters of gen1PL range between -1.92 and 1.89, within the bounds set by b_bounds.
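A bounds check of this kind can be sketched in base R. The example below rebuilds only the six rows printed above rather than calling lsasim, and the helper name within_bounds is hypothetical; the same helper could be applied to the full gen1PL data frame.

```r
# First six rows of gen1PL, copied from the output above.
head_1pl <- data.frame(
  item = 1:6,
  b = c(-0.60, 0.78, -0.75, 0.18, 1.19, 0.16),
  a = 1, c = 0, k = 1, p = 1
)

# Hypothetical helper: TRUE if every b parameter lies within the bounds.
within_bounds <- function(items, bounds) {
  all(items$b >= bounds[1] & items$b <= bounds[2])
}

range(head_1pl$b)                  # -0.75  1.19
within_bounds(head_1pl, c(-2, 2))  # TRUE
```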


2PL Items

To generate item parameters from a 2PL model, we specify n_2pl, the number of two-parameter items, and the bounds for the b and a parameters. Below we generate item parameters for n_2pl = I 2PL items. The bounds of the b parameter are again constrained to b_bounds = (-2, 2) and the bounds of the a parameter are constrained to a_bounds = c(0.75, 1.25).

gen2PL <- lsasim::item_gen(n_2pl = I, b_bounds = c(-2, 2), 
                           a_bounds = c(0.75, 1.25))

Printing the first 6 rows of gen2PL, we can see that the b and a parameters have been generated while the c parameters were fixed to 0. Column k indicates the items are dichotomous (1 threshold) and column p indicates the items are from a 2PL generating model.

head(gen2PL)
##   item     b    a c k p
## 1    1  0.25 1.00 0 1 2
## 2    2  1.45 0.77 0 1 2
## 3    3  1.57 1.17 0 1 2
## 4    4 -1.62 1.05 0 1 2
## 5    5  0.41 0.79 0 1 2
## 6    6 -1.05 0.87 0 1 2

The 30 generated b parameters of gen2PL range between -1.92 and 1.90, within the bounds set by b_bounds.


3PL Items

To generate item parameters from a 3PL model, we specify n_3pl, the number of three-parameter items, and the bounds for the b, a, and c parameters. Below we generate item parameters for n_3pl = I 3PL items. The bounds of the b and a parameters are again constrained to b_bounds = (-2, 2) and a_bounds = c(0.75, 1.25). The bounds of the c parameter are set to c_bounds = c(0, .25).

gen3PL <- lsasim::item_gen(n_3pl = I, b_bounds = c(-2, 2), 
                           a_bounds = c(.75, 1.25), c_bounds = c(0, .25))

Printing the first 6 rows of gen3PL, we can see that the b, a, and c parameters have been generated. Column k indicates the items are dichotomous (1 threshold) and column p indicates the items are from a 3PL generating model.

head(gen3PL)
##   item     b    a    c k p
## 1    1 -0.49 1.10 0.20 1 3
## 2    2  0.27 1.08 0.07 1 3
## 3    3  0.46 0.91 0.06 1 3
## 4    4 -0.27 0.98 0.15 1 3
## 5    5  0.40 1.01 0.18 1 3
## 6    6 -2.02 1.24 0.11 1 3

The 30 generated b parameters of gen3PL range between -2.1 and 1.67. Of the 30 items, 2 fall below the lower generating bound of -2.

print(gen3PL[which(gen3PL$b > 2 | gen3PL$b < -2), ])
##    item     b    a    c k p
## 6     6 -2.02 1.24 0.11 1 3
## 21   21 -2.10 0.78 0.12 1 3

Polytomous IRT models

The lsasim::item_gen function can also generate items for partial credit models:

$$ p(y_{ij} = k | \theta_{j}) = \frac{\text{exp} \left[ \sum_{u = 1}^{k} a_{i} (\theta_{j} - b_{i} + d_{iu}) \right]} {\sum_{v = 1}^{K_{i}} \text{exp} \left[ \sum_{u = 1}^{v} a_{i} (\theta_{j} - b_{i} + d_{iu}) \right]} $$

where $k$ is the response on item $i$ by respondent $j$ and $K_{i}$ is the maximum score on item $i$. The parameter $b_{i}$ is the average difficulty for item $i$ and $d_{iu}$ is the threshold parameter between scores $u$ and $u - 1$ for item $i$.
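A base-R sketch of the equation above follows. The function name pcm_probs is hypothetical, and the example assumes that d holds one entry per score category with the first entry set to 0. How the d columns returned by item_gen map onto these entries is an assumption here, not something the equation settles.

```r
# Category probabilities following the equation as written.
# d must contain one value per category (length K).
pcm_probs <- function(theta, b, d, a = 1) {
  z <- cumsum(a * (theta - b + d))  # inner sums over u = 1, ..., v for each v
  ez <- exp(z - max(z))             # subtracting max(z) avoids overflow
  ez / sum(ez)                      # normalise over the K categories
}

# Illustrative values only; the leading 0 in d is an assumption.
probs <- pcm_probs(theta = 0, b = 0.24, d = c(0, -0.76, 0.75))
probs
sum(probs)  # 1
```

With a = 1 the function corresponds to a partial credit model; other values of a give the generalized partial credit form.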

For a set of partial credit items, lsasim::item_gen returns an I × Q data frame where $Q = 6 + K_{i} - 1$. That is, six of the Q columns are the same as the columns from the dichotomous item data frame. The $K_{i} - 1$ new columns hold the values of the item thresholds, so an item with $K_{i} = 3$ has two additional columns, d1 and d2. To obtain the difficulty at the first threshold, compute b + d1; to obtain the difficulty at the second threshold, compute b + d2.


Partial credit items

To generate item parameters from a partial credit model, we specify n_1pl, the number of one-parameter items, thresholds, and b_bounds. Below we generate n_1pl = I 1PL item parameters, each with thresholds = 2 thresholds. The bounds of the b parameters are again constrained to b_bounds = (-2, 2).

genPC <- lsasim::item_gen(n_1pl = I, thresholds = 2, b_bounds = c(-2, 2))

Printing the first 6 rows of genPC, we can see that the b, d1, and d2 parameters have been generated while a and c are fixed to 1 and 0, respectively. Column k indicates the items have two thresholds and column p indicates the items are from a 1PL generating model.

head(genPC)
##   item     b    d1   d2 a c k p
## 1    1  0.24 -0.76 0.75 1 0 2 1
## 2    2 -1.34 -0.17 0.18 1 0 2 1
## 3    3  0.97 -0.57 0.56 1 0 2 1
## 4    4 -0.41 -0.65 0.65 1 0 2 1
## 5    5 -0.84 -0.18 0.17 1 0 2 1
## 6    6  0.04 -0.49 0.49 1 0 2 1

The 30 generated b parameters of genPC range between -1.34 and 1.23, within the bounds set by b_bounds. Furthermore, the 60 generated thresholds of genPC range between -1.78 and 1.97.
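The threshold difficulties b + d1 and b + d2 described earlier can be computed directly from the returned columns. The sketch below rebuilds the six printed rows of genPC in base R instead of calling lsasim; the column names t1 and t2 are hypothetical labels for the two threshold difficulties.

```r
# First six rows of genPC, copied from the output above (b, d1, d2 only).
head_pc <- data.frame(
  item = 1:6,
  b  = c(0.24, -1.34, 0.97, -0.41, -0.84, 0.04),
  d1 = c(-0.76, -0.17, -0.57, -0.65, -0.18, -0.49),
  d2 = c(0.75, 0.18, 0.56, 0.65, 0.17, 0.49)
)

# Difficulty at each threshold, as described in the text: b + d1 and b + d2.
head_pc$t1 <- head_pc$b + head_pc$d1
head_pc$t2 <- head_pc$b + head_pc$d2

head_pc[, c("item", "t1", "t2")]
range(c(head_pc$t1, head_pc$t2))  # range over these 12 thresholds only
```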


Generalized partial credit items

To generate item parameters from a generalized partial credit model, we specify n_2pl, the number of two-parameter items, thresholds, b_bounds, and a_bounds. Below we generate n_2pl = I 2PL item parameters, each with thresholds = 2 thresholds. The bounds of the b and a parameters are constrained to b_bounds = (-2, 2) and a_bounds = (.75, 1.25).

genGPC <- lsasim::item_gen(n_2pl = I, thresholds = 2, b_bounds = c(-2, 2), 
                           a_bounds = c(.75, 1.25))

Printing the first 6 rows of genGPC, we can see that the b, d1, d2, and a parameters have been generated while c is fixed to 0. Column k indicates the items have two thresholds and column p indicates the items are from a 2PL generating model.

head(genGPC)
##   item     b    d1   d2    a c k p
## 1    1  0.26 -0.24 0.25 0.87 0 2 2
## 2    2  0.70 -0.73 0.73 1.12 0 2 2
## 3    3  0.07 -0.11 0.11 1.23 0 2 2
## 4    4 -1.17 -0.30 0.30 1.19 0 2 2
## 5    5 -0.29 -0.42 0.42 0.98 0 2 2
## 6    6 -0.03 -0.40 0.40 1.25 0 2 2

The 30 generated b parameters of genGPC range between -1.41 and 1.32, within the bounds set by b_bounds. Furthermore, the 60 generated thresholds of genGPC range between -1.57 and 1.88.


Mixed type items

Tests are often constructed from multiple item types. For example, PISA 2012 was a mix of 1PL items and partial credit items. Drawing on the same arguments, lsasim::item_gen enables the generation of mixed-type items. To do this, we extend n_1pl, n_2pl, n_3pl and thresholds to vectors. For the following examples, we will generate I = 20 items.

I <- 20

1PL and partial credit items

To generate item parameters from a 1PL and partial credit model, we specify a length-2 vector for both n_1pl and thresholds. The b_bounds argument is specified as before and will be used to draw item parameters for both item types. Below we specify n_1pl = c(I/2, I/2) and thresholds = c(1, 2). Under this specification, n_1pl[1] corresponds to thresholds[1] and n_1pl[2] corresponds to thresholds[2]. Thus, we will generate 10 1PL items and 10 partial credit items. The single b_bounds = (-2, 2) indicates that all b parameters will be drawn from the same uniform distribution.

gen1PL_PC <- lsasim::item_gen(n_1pl = c(I/2, I/2), thresholds = c(1, 2), 
                              b_bounds = c(-2, 2))

Printing all 20 rows of gen1PL_PC, we can see that the b parameter has been generated for all items and d1 and d2 parameters have been generated for the last 10 items. Furthermore, k = 1 for the first 10 items and k = 2 for the last 10 items.

print(gen1PL_PC)
##    item     b    d1   d2 a c k p
## 1     1  0.21  0.00 0.00 1 0 1 1
## 2     2 -0.61  0.00 0.00 1 0 1 1
## 3     3 -1.76  0.00 0.00 1 0 1 1
## 4     4  0.45  0.00 0.00 1 0 1 1
## 5     5  0.68  0.00 0.00 1 0 1 1
## 6     6  0.94  0.00 0.00 1 0 1 1
## 7     7  1.62  0.00 0.00 1 0 1 1
## 8     8 -0.68  0.00 0.00 1 0 1 1
## 9     9  0.42  0.00 0.00 1 0 1 1
## 10   10  2.00  0.00 0.00 1 0 1 1
## 11   11 -0.99 -0.42 0.41 1 0 2 1
## 12   12  0.99 -0.75 0.75 1 0 2 1
## 13   13 -1.51 -0.06 0.06 1 0 2 1
## 14   14  0.48 -0.42 0.42 1 0 2 1
## 15   15  0.98 -0.55 0.56 1 0 2 1
## 16   16 -0.95 -0.69 0.69 1 0 2 1
## 17   17  0.11 -0.41 0.40 1 0 2 1
## 18   18 -0.70 -0.68 0.68 1 0 2 1
## 19   19 -0.32 -0.58 0.57 1 0 2 1
## 20   20  0.29 -0.51 0.50 1 0 2 1

The 20 generated b parameters of gen1PL_PC range between -1.76 and 2.00, within the bounds set by b_bounds. Furthermore, the 30 generated thresholds of gen1PL_PC range between -1.76 and 2.00.
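The element-wise pairing of n_1pl and thresholds can be mimicked in base R with rep(). This is only a sketch of the pairing rule; it assumes items are ordered block by block, as in the printed output, and does not call lsasim.

```r
I <- 20
n_1pl <- c(I / 2, I / 2)
thresholds <- c(1, 2)

# n_1pl[i] items each receive thresholds[i] thresholds.
k <- rep(thresholds, times = n_1pl)

table(k)  # 10 items with k = 1 and 10 items with k = 2
sum(k)    # 30 thresholds in total
```

The total of 30 matches the number of generated thresholds reported for gen1PL_PC: one for each 1PL item and two for each partial credit item.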


3PL and generalized partial credit items

To generate item parameters from a 3PL and generalized partial credit model, we specify n_2pl, n_3pl and thresholds as well as b_bounds, a_bounds, and c_bounds. Below we specify n_2pl = I/2, n_3pl = I/2 and thresholds = 2. Under this specification, thresholds does not require a vector because all n_3pl items are constrained to have thresholds = 1. Thus, n_2pl = I/2 corresponds to thresholds = 2. Furthermore, b_bounds = (-2, 2) and a_bounds = c(.75, 1.25) will be used to generate item parameters for both models. However, c_bounds = c(0, 0.25) will only generate c parameters for those n_3pl = I/2 items.

gen3PL_GPC <- lsasim::item_gen(n_2pl = I/2, n_3pl = I/2, thresholds = 2,  
                               b_bounds = c(-2, 2), a_bounds = c(.75, 1.25), 
                               c_bounds = c(0, 0.25))

Printing all 20 rows of gen3PL_GPC, we can see that the b and a parameters have been generated for all items, the d1 and d2 parameters have been generated for the first 10 items, and the c parameter has been generated for the last 10 items. Furthermore, k = 2 and p = 2 for the first 10 items and k = 1 and p = 3 for the last 10 items.

print(gen3PL_GPC)
##    item     b    d1   d2    a    c k p
## 1     1 -0.15 -0.34 0.34 0.92 0.00 2 2
## 2     2  1.21 -0.74 0.73 1.04 0.00 2 2
## 3     3 -0.18 -0.09 0.08 0.86 0.00 2 2
## 4     4 -0.80 -0.48 0.47 1.14 0.00 2 2
## 5     5 -0.06 -0.33 0.33 1.16 0.00 2 2
## 6     6 -1.16 -0.19 0.19 1.10 0.00 2 2
## 7     7 -0.06 -0.53 0.53 0.79 0.00 2 2
## 8     8 -0.99 -0.66 0.66 1.02 0.00 2 2
## 9     9 -0.28 -0.07 0.06 0.99 0.00 2 2
## 10   10  0.16 -0.39 0.39 0.88 0.00 2 2
## 11   11 -0.18  0.00 0.00 0.78 0.12 1 3
## 12   12  0.29  0.00 0.00 1.21 0.05 1 3
## 13   13  0.51  0.00 0.00 0.78 0.16 1 3
## 14   14  0.40  0.00 0.00 1.13 0.11 1 3
## 15   15  1.90  0.00 0.00 1.05 0.09 1 3
## 16   16  0.99  0.00 0.00 1.13 0.14 1 3
## 17   17  1.76  0.00 0.00 1.12 0.20 1 3
## 18   18  0.62  0.00 0.00 1.21 0.13 1 3
## 19   19 -0.58  0.00 0.00 1.06 0.11 1 3
## 20   20 -1.04  0.00 0.00 1.01 0.10 1 3

The 20 generated b parameters of gen3PL_GPC range between -1.16 and 1.90, within the bounds set by b_bounds. Furthermore, the 30 generated thresholds of gen3PL_GPC range between -1.65 and 1.94.


1PL, PC, GPC, and 3PL items

Finally, we generate 20 items from four different models.

  • 5 1PL items
  • 5 partial credit items with three thresholds
  • 5 generalized partial credit items with two thresholds
  • 5 3PL items

The first aspect of the function below to note is the vector of thresholds, thresholds = c(1, 2, 3). This dictates that both n_1pl and n_2pl must also be length-3 vectors. The argument n_1pl = c(I/4, 0, I/4) specifies I/4 1PL items, zero two-threshold partial credit items, and I/4 three-threshold partial credit items. Similarly, the argument n_2pl = c(0, I/4, 0) specifies zero 2PL items, I/4 two-threshold generalized partial credit items, and zero three-threshold generalized partial credit items. As mentioned above, n_3pl does not require a vector as 3PL items are constrained to one threshold. Thus, n_3pl = I/4 indicates I/4 3PL items. The b parameters for all items will be generated from b_bounds = c(-2, 2), the a parameters for the generalized partial credit items and 3PL items will be generated from a_bounds = c(0.75, 1.25) and the c parameters for the 3PL items will be generated from c_bounds = c(0, 0.25).

gen1PL_3PL_PC_GPC <- lsasim::item_gen(n_1pl = c(I/4, 0, I/4), 
                                      n_2pl = c(0, I/4, 0), n_3pl = I/4,
                                      thresholds = c(1, 2, 3), 
                                      b_bounds = c(-2, 2), 
                                      a_bounds = c(0.75, 1.25), 
                                      c_bounds = c(0, 0.25))

Printing all 20 rows of gen1PL_3PL_PC_GPC, we can use the k and p values to identify which items correspond to which models.

print(gen1PL_3PL_PC_GPC)
##    item     b    d1    d2   d3    a    c k p
## 1     1  0.95  0.00  0.00 0.00 1.00 0.00 1 1
## 2     2  1.00  0.00  0.00 0.00 1.00 0.00 1 1
## 3     3 -1.51  0.00  0.00 0.00 1.00 0.00 1 1
## 4     4  1.30  0.00  0.00 0.00 1.00 0.00 1 1
## 5     5 -0.72  0.00  0.00 0.00 1.00 0.00 1 1
## 6     6 -0.89 -0.17  0.02 0.14 1.00 0.00 3 1
## 7     7 -0.63 -0.60  0.02 0.59 1.00 0.00 3 1
## 8     8  1.02 -0.93  0.35 0.59 1.00 0.00 3 1
## 9     9  0.64 -1.03 -0.17 1.21 1.00 0.00 3 1
## 10   10 -0.69 -0.52  0.04 0.47 1.00 0.00 3 1
## 11   11  0.54 -0.69  0.68 0.00 1.19 0.00 2 2
## 12   12 -0.90 -0.71  0.70 0.00 0.97 0.00 2 2
## 13   13 -0.90 -0.61  0.60 0.00 1.13 0.00 2 2
## 14   14  0.83 -0.50  0.50 0.00 0.85 0.00 2 2
## 15   15 -0.82 -0.75  0.75 0.00 1.03 0.00 2 2
## 16   16  1.40  0.00  0.00 0.00 1.15 0.18 1 3
## 17   17  1.89  0.00  0.00 0.00 0.96 0.01 1 3
## 18   18  1.33  0.00  0.00 0.00 1.13 0.04 1 3
## 19   19 -0.28  0.00  0.00 0.00 1.11 0.12 1 3
## 20   20 -1.16  0.00  0.00 0.00 0.99 0.04 1 3

The 20 generated b parameters of gen1PL_3PL_PC_GPC range between -1.51 and 1.89, within the bounds set by b_bounds. Furthermore, the 35 generated thresholds of gen1PL_3PL_PC_GPC range between -1.61 and 1.89.
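The k and p pattern in the printed output can be reproduced from the arguments alone. The helper below, expected_layout, is hypothetical and written in base R; it assumes the ordering seen in the output (n_1pl blocks first, then n_2pl blocks, then the 3PL items) and is not taken from the lsasim source.

```r
# Hypothetical helper: expected k (thresholds) and p (model) for each item,
# assuming the block ordering seen in the printed output.
expected_layout <- function(n_1pl, n_2pl, n_3pl, thresholds) {
  k <- c(rep(thresholds, times = n_1pl),  # 1PL and partial credit items
         rep(thresholds, times = n_2pl),  # 2PL and generalized partial credit items
         rep(1, n_3pl))                   # 3PL items always have one threshold
  p <- c(rep(1, sum(n_1pl)), rep(2, sum(n_2pl)), rep(3, n_3pl))
  data.frame(item = seq_along(k), k = k, p = p)
}

I <- 20
layout <- expected_layout(n_1pl = c(I / 4, 0, I / 4), n_2pl = c(0, I / 4, 0),
                          n_3pl = I / 4, thresholds = c(1, 2, 3))

# Cross-tabulate thresholds (rows) by model (columns).
table(k = layout$k, p = layout$p)
```

Each non-empty cell of the table identifies one item type: k = 1 with p = 1 is a 1PL item, k = 3 with p = 1 a partial credit item, k = 2 with p = 2 a generalized partial credit item, and k = 1 with p = 3 a 3PL item.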
