Title: | Modeling of Revealed Preferences Matchings |
---|---|
Description: | Statistical estimation of revealed preference models from data collected on bipartite matchings. The models are for matchings within a bipartite population where individuals have utility for people based on known and unknown characteristics. People can form a partnership or remain unpartnered. The model represents both the availability of potential partners of different types and preferences of individuals for such people. The software estimates preference parameters based on sample survey data on partnerships and population composition. The simulation of matchings and goodness-of-fit are considered. See Goyal, Handcock, Jackson, Rendall and Yeung (2022) <doi:10.1093/jrsssa/qnad031>. |
Authors: | Mark S. Handcock [aut, cre] , Ryan M. Admiraal [ctb], Fiona C. Yeung [ctb], Heide M. Jackson [ctb], Michael S. Rendall [ctb], Shuchi Goyal [ctb] |
Maintainer: | Mark S. Handcock <[email protected]> |
License: | GPL-3 + file LICENSE |
Version: | 0.7-3 |
Built: | 2024-11-17 04:55:36 UTC |
Source: | https://github.com/handcock/rpm |
An integrated set of tools to fit a revealed preference model for men and women of certain characteristics (or shared characteristics) of people of the opposite sex. The model assumes a one-to-one stable matching using an observed set of matchings and a set of (possibly dyadic) covariates to estimate the parameters for linear equations of utilities. It does this using a large-population likelihood based on ideas from Dagsvik (2000), Menzel (2015) and Goyal et al. (2023).
For a complete list of the functions, use library(help="rpm") or read the rest of the manual.
When publishing results obtained using this package the original authors are to be cited as:
Mark S. Handcock, Ryan M. Admiraal, Fiona C. Yeung, Heide M. Jackson, Michael S. Rendall and Shuchi Goyal (2022) rpm: Modeling of Revealed Preferences Matchings R package, Los Angeles, CA. Version 0.70, https://github.com/handcock/rpm.
All programs derived from this package must cite it. For complete citation information, use citation(package="rpm").
For details on how to construct data for input to rpm(), see the documentation: help(fauxmatching)
For information on the current terms that can be used in formulas for rpm(), see the documentation: help("rpm-terms")
No return value, called for side effects.
Mark S. Handcock <[email protected]>
Goyal, Shuchi; Handcock, Mark S.; Jackson, Heide M.; Rendall, Michael S. and Yeung, Fiona C. (2023). A Practical Revealed Preference Model for Separating Preferences and Availability Effects in Marriage Formation, Journal of the Royal Statistical Society, A. doi:10.1093/jrsssa/qnad031
Dagsvik, John K. (2000) Aggregation in Matching Markets, International Economic Review, Vol. 41, 27-57. JSTOR: https://www.jstor.org/stable/2648822, doi:10.1111/1468-2354.00054
Menzel, Konrad (2015). Large Matching Markets as Two-Sided Demand Systems, Econometrica, Vol. 83, No. 3 (May, 2015), 897-941. doi:10.3982/ECTA12299
library(rpm)
data(fauxmatching)
fit <- rpm(~match("edu") + WtoM_diff("edu",3),
           Xdata=fauxmatching$Xdata, Zdata=fauxmatching$Zdata,
           X_w="X_w", Z_w="Z_w",
           pair_w="pair_w", pair_id="pair_id", Xid="pid", Zid="pid",
           sampled="sampled", sampling_design="stock-flow")
summary(fit)
# For details on how to construct data for input:
help(fauxmatching)
# For information on the current terms that can be used in formulas:
help("rpm-terms")
Compute an analysis of variance table for one or more rpm fits.
## S3 method for class 'rpm'
anova(object, ...)
anova_rpmlist(object, ...)
object , ...
|
Specifying a single object gives a sequential analysis of variance table for that fit. That is, the reductions in the residual sum of squares as each term of the formula is added in turn are given in the rows of a table, plus the residual sum of squares.
The table will contain F statistics (and P values) comparing the mean square for the row to the residual mean square.
If more than one object is specified, the table has a row for the residual degrees of freedom and sum of squares for each model. For all but the first model, the change in degrees of freedom and sum of squares is also given. (This only makes statistical sense if the models are nested.) It is conventional to list the models from smallest to largest, but this is up to the user.
Optionally the table can include test statistics. Normally the F statistic is most appropriate, which compares the mean square for a row to the residual sum of squares for the largest model considered. If scale is specified, chi-squared tests can be used. Mallows' Cp statistic is the residual sum of squares plus twice the estimate of sigma^2 times the residual degrees of freedom.
If any of the objects do not have estimated log-likelihoods, produces an error, unless eval.loglik=TRUE.
An object of class "anova" inheriting from class "data.frame".
The comparison between two or more models will only be valid if they are fitted to the same dataset. This may be a problem if there are missing values.
The model fitting function rpm, anova, logLik.rpm for adding the log-likelihood to an existing rpm object.
library(rpm)
data(fauxmatching)
fit <- rpm(~match("edu") + WtoM_diff("edu",3),
           Xdata=fauxmatching$Xdata, Zdata=fauxmatching$Zdata,
           X_w="X_w", Z_w="Z_w",
           pair_w="pair_w", pair_id="pair_id", Xid="pid", Zid="pid",
           sampled="sampled", sampling_design="stock-flow")
anova(fit)
Auxiliary function providing a user interface for fine-tuning the RPM model fitting algorithm, which computes the MLPLE of the Revealed Preferences Model via optimization.
control.rpm(
  init_theta = NULL,
  algorithm = "NLOPT_LD_SLSQP",
  print_level = 0,
  xtol_rel = 1e-08,
  ftol_rel = 1e-08,
  ftol_abs = 1e-06,
  lower.bound = -10,
  upper.bound = 10,
  maxeval = 2000,
  bs.maxeval = 2000,
  bs.xtol_rel = 1e-08,
  bs.save.data = FALSE,
  check_derivatives = FALSE,
  bootstrap = TRUE,
  hessian = FALSE,
  seed = NULL,
  parallel.type = "PSOCK",
  parallel.ncores = 1,
  ncores = 1,
  constraints = c("none", "M_single"),
  logodds_single = FALSE,
  save.data = TRUE,
  robust.cov = FALSE,
  local_opts = list(algorithm = "NLOPT_LD_SLSQP", xtol_rel = 1e-07, maxeval = maxeval),
  nbootstrap = 50,
  nbootstrap.SD = 20,
  large.population.bootstrap = 5000,
  alpha = 0.05
)
init_theta |
vector; numeric vector of starting parameter values. This value and other possible starting values are applied to find a good optimizer. This can either have length the number of parameters corresponding to the terms in the formula or in addition the equilibrium constraints. |
algorithm |
string; The optimization algorithm to use. See |
print_level |
integer; possible values: 0, 1, 2, or 3. This controls how much output is shown during the optimization process. Possible values: 0 (default): no output; 1: show iteration number and value of objective function; 2: 1 + show value of equalities/constraints; 3: 2 + show value of controls. |
xtol_rel |
scalar; Stop when an optimization step (or an estimate of the optimum) changes every parameter by less than xtol_rel multiplied by the absolute value of the parameter. If there is any chance that an optimal parameter is close to zero, you might want to set an absolute tolerance with xtol_abs as well. Criterion is disabled if xtol_rel is non-positive. Possible values: xtol_rel > 0. Default value: 1.0e-08. |
ftol_rel |
scalar; Stop when an optimization step (or an estimate of the optimum) changes the log-likelihood by less than ftol_rel multiplied by the absolute value of the log-likelihood. |
ftol_abs |
scalar; Stop when an optimization step (or an estimate of the optimum) changes the log-likelihood by less than ftol_abs. Criterion is disabled if ftol_abs is non-positive. Possible values: ftol_abs > 0. Default value: 1.0e-06. |
lower.bound |
numeric; lower bounds on the parameter estimates (that is, the beta and gamma parameters in the model). Can be a vector of the same size as the coefficient vector or a single number which is used for all bounds. |
upper.bound |
numeric; upper bounds on the parameter estimates (that is, the beta and gamma parameters in the model). Can be a vector of the same size as the coefficient vector or a single number which is used for all bounds. |
maxeval |
integer; Stop when the number of function evaluations exceeds maxeval. This is not a strict maximum: the number of function evaluations may exceed maxeval slightly, depending upon the algorithm. Criterion is disabled if maxeval is non-positive. Default value: 2000. |
bs.maxeval |
integer; Stop the bootstrap optimization when the number of function evaluations exceeds bs.maxeval. This is not a strict maximum: the number of function evaluations may exceed bs.maxeval slightly, depending upon the algorithm. Criterion is disabled if bs.maxeval is non-positive. Default value: 2000. |
bs.xtol_rel |
scalar; Stop the bootstrap optimization when an optimization step (or an estimate of the optimum) changes every parameter by less than bs.xtol_rel multiplied by the absolute value of the parameter. See the parameter xtol_rel for details. |
bs.save.data |
logical; Should the bootstrapped data be saved in the bootstrap return list (as components
|
check_derivatives |
logical; Compare the user-supplied analytic gradients with the finite difference approximations. |
bootstrap |
logical; If 'TRUE' use a bootstrap to compute the standard errors and associated
covariance matrices. If 'FALSE' base the standard errors and associated
covariance matrices on the Hessian of the
(constrained) log-likelihood.
In all cases the extended covariance matrix is returned in |
hessian |
logical; Deprecated. The negation of the 'bootstrap' argument. |
seed |
Seed value (integer) for the random number generator. See
|
parallel.type |
The type of cluster to run. The typical choices are "MPI" and "PSOCK", where you must have "MPI" installed to use the former. The default is "PSOCK". |
parallel.ncores |
count; Deprecated. The renamed 'ncores' argument. |
ncores |
Number of processors to use in the bootstrap computations. The default is 1, that is no parallel processing. |
constraints |
string; Additional constraints to force the proportions of singles to match the (weighted) population estimates? This should not be required, but does stabilize the estimates in cases where there is much uncertainty. The possible values are "none" and "M_single" (the numbers of male singles of each type are reproduced). Note that adding constraints leads to over-constrained optimization which may fail. |
logodds_single |
logical; Should the log-odds ratio of being single relative to a randomly chosen person of the same sex from the population be returned? If FALSE the log-odds of being single is returned. This is a pure preference parameter. |
save.data |
logical; Should the data be saved in the return list (as components
|
robust.cov |
logical; Should the covariance matrix of the estimates be computed using a robust method (MASS::cov.mcd)? Only use if the bootstrap is unstable. |
local_opts |
list; list of options for |
nbootstrap |
integer; Number of bootstrap resamples to take in the estimation of the covariance matrix of the parameter estimates. |
nbootstrap.SD |
integer; Number of bootstrap resamples to take in the estimation of the variances used in the studentized bootstrap. This is run for each nbootstrap sample and so is expensive. |
large.population.bootstrap |
integer; If the population size exceeds |
alpha |
proportion; Type I error rate for the confidence intervals produced by the bootstrap. |
This function is only used within a call to the rpm function.
Some of the arguments are not yet fully implemented. It will evolve slowly to incorporate more arguments as the package develops.
A list with arguments as components.
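As a minimal sketch of how a control list might be assembled, the snippet below uses only argument names that appear in the usage above. The particular values are arbitrary illustrations, not recommendations, and the call is guarded so that it only runs when the rpm package is installed.

```r
# Hypothetical tuning choices; every argument name comes from the usage above.
ctrl_args <- list(
  nbootstrap = 100,   # more resamples than the default of 50
  ncores     = 2,     # run the bootstrap on two processors
  seed       = 1,     # make the resampling reproducible
  maxeval    = 5000   # allow more function evaluations
)

if (requireNamespace("rpm", quietly = TRUE)) {
  ctrl <- do.call(rpm::control.rpm, ctrl_args)
  # The result is a list with the arguments as components,
  # so individual settings can be inspected directly.
  print(ctrl$nbootstrap)
  # It would then be passed on as: rpm(..., control = ctrl)
}
```

Keeping the settings in a named list makes it easy to record exactly which tuning values were used for a given fit.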
This data set represents a simulation of a bipartite matching. The data set is named fauxmatching. Its primary use is to illustrate the fitting of a Revealed Preference Matchings Model (rpm). The model assumes a one-to-one stable matching using an observed set of matchings and a set of (possibly dyadic) covariates to estimate the parameters for linear equations of utilities. This provides such data for a matching between men and women of certain characteristics (or shared characteristics) of people of the opposite sex.
data(fauxmatching)
fauxmatching is a list containing a pair of data.frame objects: Xdata and Zdata.
Xdata is for women. Each row is a woman, each column is a variable on that woman or her partnerships. The women's ID variable is called pid and the variable with the ID of the woman's partner is called pair_id. If the woman is single the man's ID is NA.
Zdata is for men. Each row is a man, each column is a variable on that man. The men's ID variable is called pid.
pair_id: The ID of the person's partner. This is in both Xdata and Zdata.
sampled: The indicator that the person was sampled directly (as distinct from being included as the match of a directly sampled person). All single people are directly sampled. This is in both Xdata and Zdata.
The pairings are determined by the pair_id variable in Xdata. If that variable is NA then the woman is assumed to be single. If men are listed in Zdata and are not partnered then they are assumed single.
Weights are specified by three optional variables in Xdata.
X_w: The weight variable for women. The sum of the weights of the sampled women is the number of women in the population.
Z_w: The weight variable for men. The sum of the weights of the sampled men is the number of men in the population.
pair_w: The weight variable for pairs.
No return value, called for side effects.
The data set is a simulation based upon an rpm model fit to data from the 2008 SIPP.
Goyal, Shuchi; Handcock, Mark S.; Jackson, Heide M.; Rendall, Michael S. and Yeung, Fiona C. (2023). A Practical Revealed Preference Model for Separating Preferences and Availability Effects in Marriage Formation, Journal of the Royal Statistical Society, A. doi:10.1093/jrsssa/qnad031
Dagsvik, John K. (2000) Aggregation in Matching Markets, International Economic Review, Vol. 41, 27-57. JSTOR: https://www.jstor.org/stable/2648822, doi:10.1111/1468-2354.00054
Menzel, Konrad (2015). Large Matching Markets as Two-Sided Demand Systems, Econometrica, Vol. 83, No. 3 (May, 2015), 897-941. doi:10.3982/ECTA12299
library(rpm)
data(fauxmatching)
fit <- rpm(~match("edu") + WtoM_diff("edu",3),
           Xdata=fauxmatching$Xdata, Zdata=fauxmatching$Zdata,
           X_w="X_w", Z_w="Z_w",
           pair_w="pair_w", pair_id="pair_id", Xid="pid", Zid="pid",
           sampled="sampled")
summary(fit)
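To make the layout described in this section concrete, here is a small base R sketch of a hand-built pair of data frames. The IDs and the edu values are invented for illustration and are not taken from fauxmatching; weight columns are left out to keep the toy small. The checks at the end restate the rules given above: a missing pair_id means single, partners point at each other, and all singles are directly sampled.

```r
# Toy data only: IDs and the 'edu' covariate are invented for illustration.
Xdata <- data.frame(
  pid     = c("w1", "w2", "w3"),
  pair_id = c("m2", NA, "m1"),      # w2 is single
  sampled = c(TRUE, TRUE, FALSE),   # w3 enters only as the partner of m1
  edu     = c(1, 2, 3),
  stringsAsFactors = FALSE
)
Zdata <- data.frame(
  pid     = c("m1", "m2", "m3"),
  pair_id = c("w3", "w1", NA),      # m3 is single
  sampled = c(TRUE, FALSE, TRUE),   # m2 enters only as the partner of w1
  edu     = c(3, 1, 2),
  stringsAsFactors = FALSE
)

# Singles are the rows whose partner ID is missing.
single_women <- Xdata$pid[is.na(Xdata$pair_id)]
single_men   <- Zdata$pid[is.na(Zdata$pair_id)]

# Every partnered woman should point at a man who points back at her.
partnered  <- !is.na(Xdata$pair_id)
man_row    <- match(Xdata$pair_id[partnered], Zdata$pid)
reciprocal <- all(Zdata$pair_id[man_row] == Xdata$pid[partnered])

# All single people should be directly sampled.
singles_sampled <- all(Xdata$sampled[!partnered]) &&
  all(Zdata$sampled[is.na(Zdata$pair_id)])

print(list(single_women = single_women, single_men = single_men,
           reciprocal = reciprocal, singles_sampled = singles_sampled))
```

Running checks like these before fitting can catch mismatched IDs early.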
This code allows the self-matched option
Gale_Shapley(U, V, return.data.frame = FALSE, cpp = TRUE, nmax = 10 * nrow(U))
U |
The utility matrix for the women's side. Each row is a woman, each column is a man.
The matrix entry (i,j) is the utility that woman |
V |
The utility matrix for the men's side. Each column is a man, each row is a woman.
The matrix entry (i,j) is the utility that man |
return.data.frame |
logical Should a |
cpp |
logical Should the |
nmax |
count The maximum number of iterations of the inner loop within the Gale-Shapley algorithm. This can be reduced to speed up the algorithm at the potential cost of many partnerships being non-equilibrium. |
The value returned depends on the return.data.frame value.
If TRUE, it returns
data.frame |
a two-column |
If FALSE, it returns the following matrix:
mu |
If |
Goyal, Shuchi; Handcock, Mark S.; Jackson, Heide M.; Rendall, Michael S. and Yeung, Fiona C. (2023). A Practical Revealed Preference Model for Separating Preferences and Availability Effects in Marriage Formation, Journal of the Royal Statistical Society, A. doi:10.1093/jrsssa/qnad031
Dagsvik, John K. (2000) Aggregation in Matching Markets, International Economic Review, Vol. 41, 27-57. JSTOR: https://www.jstor.org/stable/2648822, doi:10.1111/1468-2354.00054
Menzel, Konrad (2015). Large Matching Markets as Two-Sided Demand Systems, Econometrica, Vol. 83, No. 3 (May, 2015), 897-941. doi:10.3982/ECTA12299
rpm
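For readers unfamiliar with the algorithm, the following base R sketch shows textbook deferred acceptance on two small utility matrices laid out as described for U and V (rows are women, columns are men). It is not the package's implementation: the function name deferred_acceptance is hypothetical, women are assumed to be the proposing side, and the self-matched option and the nmax cap are not modelled.

```r
# Textbook deferred acceptance with women proposing (an assumption).
# U[i, j]: utility woman i gets from man j.
# V[i, j]: utility man j gets from woman i.
deferred_acceptance <- function(U, V) {
  nw <- nrow(U)
  nm <- ncol(U)
  # Each woman's ranking of the men, best first.
  prefs <- lapply(seq_len(nw), function(i) order(U[i, ], decreasing = TRUE))
  next_choice <- rep(1L, nw)          # next position to try in each list
  wife    <- rep(NA_integer_, nm)     # wife[j]: woman currently held by man j
  husband <- rep(NA_integer_, nw)     # husband[i]: man currently holding woman i
  free <- seq_len(nw)
  while (length(free) > 0) {
    i <- free[1]
    if (next_choice[i] > nm) {        # list exhausted: woman i stays single
      free <- free[-1]
      next
    }
    j <- prefs[[i]][next_choice[i]]
    next_choice[i] <- next_choice[i] + 1L
    k <- wife[j]
    if (is.na(k)) {                   # man j is free: he holds the proposal
      wife[j] <- i
      husband[i] <- j
      free <- free[-1]
    } else if (V[i, j] > V[k, j]) {   # man j prefers i: he releases k
      wife[j] <- i
      husband[i] <- j
      husband[k] <- NA_integer_
      free <- c(free[-1], k)
    }
    # Otherwise i is rejected and proposes to her next choice on the next pass.
  }
  husband
}

U <- matrix(c(3, 2, 1,
              3, 1, 2,
              1, 3, 2), nrow = 3, byrow = TRUE)
V <- cbind(c(2, 3, 1), c(3, 1, 2), c(1, 2, 3))
mu <- deferred_acceptance(U, V)
print(mu)  # 2 1 3: woman 1 with man 2, woman 2 with man 1, woman 3 with man 3
```

In this toy case no woman and man both prefer each other to their assigned partners, which is the stability property the model relies on.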
gof.rpm
...
It is typically based on the estimate from an rpm() call.
gof(object, ...)

## S3 method for class 'rpm'
gof(
  object,
  ...,
  empirical_p = TRUE,
  compare_sim = "sim-est",
  control = object$control,
  reboot = FALSE,
  verbose = FALSE
)

## S3 method for class 'gofrpm'
plot(x, ..., cex.axis = 0.7, main = "Goodness-of-fit diagnostics")
object |
list; an object of class |
... |
Additional arguments, to be passed to lower-level functions. |
empirical_p |
logical; (Optional) If TRUE the function returns the empirical p-value of the sample
statistic based on |
compare_sim |
string; describes which two objects are compared to compute simulated goodness-of-fit
statistics; valid values are |
control |
A list of control parameters for algorithm tuning. Constructed using
|
reboot |
logical; if this is |
verbose |
logical; if this is |
x |
a list, usually an object of class gofrpm |
cex.axis |
the magnification of the text used in axis notation; |
main |
Title for the goodness-of-fit plots. |
The function rpm is used to fit a revealed preference model for men and women of certain characteristics (or shared characteristics) of people of the opposite sex. The model assumes a one-to-one stable matching using an observed set of matchings and a set of (possibly dyadic) covariates to estimate the parameters for linear equations of utilities. It does this using a large-population likelihood based on ideas from Dagsvik (2000), Menzel (2015) and Goyal et al. (2023).
The model represents the dyadic utility functions as deterministic linear utility functions of dyadic variables. These utility functions are functions of observed characteristics of the women and men. These functions are entered as terms in the function call to rpm. This function simulates from such a model.
gof.rpm returns a list consisting of the following elements:
observed_pmf |
numeric matrix giving observed probability mass distribution over different household types |
model_pmf |
numeric matrix giving expected probability mass distribution from |
obs_chi_sq |
the count-based observed chi-square statistic comparing the marginal distributions of the population in the data and in the model estimate |
obs_chi_sq_cell |
the contribution to the observed chi-squared statistic by household type |
obs_kl |
the Kullback-Leibler (KL) divergence computed by comparing the observed marginal distributions to the
expected marginal distribution based on the |
obs_kl_cell |
the contribution to the observed KL divergence by household type |
empirical_p_chi_sq |
the proportion of simulated chi-square statistics that are greater than or equal to the observed chi-square statistic |
empirical_p_kl |
the proportion of simulated KL divergences that are greater than or equal to the observed KL divergence |
chi_sq_simulated |
vector of size |
kl_simulated |
vector of size |
chi_sq_cell_mean |
Mean contributions of each household type to the simulated chi_sq statistic |
chi_sq_cell_sd |
Standard deviation of the contributions of each household type to the simulated chi_sq statistics |
chi_sq_cell_median |
Median contributions of each household type to the simulated chi_sq statistic |
chi_sq_cell_iqr |
Interquartile range of the contributions of each household type to the simulated chi_sq statistics |
kl_cell_mean |
Mean contributions of each household type to the simulated KL divergences |
kl_cell_sd |
Standard deviation of the contributions of each household type to the simulated KL divergences |
kl_cell_median |
Median contributions of each household type to the simulated KL divergences |
kl_cell_iqr |
Interquartile range of the contributions of each household type to the simulated KL divergences |
gof(rpm): Calculate goodness-of-fit statistics for a Revealed Preference Matchings Model based on observed data
plot(gofrpm): plot.gofrpm plots diagnostics such as empirical p-values based on chi-square statistics and KL divergences.
See rpm for more information on these models.
Goyal, Shuchi; Handcock, Mark S.; Jackson, Heide M.; Rendall, Michael S. and Yeung, Fiona C. (2023). A Practical Revealed Preference Model for Separating Preferences and Availability Effects in Marriage Formation, Journal of the Royal Statistical Society, A. doi:10.1093/jrsssa/qnad031
Dagsvik, John K. (2000) Aggregation in Matching Markets, International Economic Review, Vol. 41, 27-57. JSTOR: https://www.jstor.org/stable/2648822, doi:10.1111/1468-2354.00054
Menzel, Konrad (2015). Large Matching Markets as Two-Sided Demand Systems, Econometrica, Vol. 83, No. 3 (May, 2015), 897-941. doi:10.3982/ECTA12299
library(rpm)
data(fauxmatching)
fit <- rpm(~match("edu") + WtoM_diff("edu",3),
           Xdata=fauxmatching$Xdata, Zdata=fauxmatching$Zdata,
           X_w="X_w", Z_w="Z_w",
           pair_w="pair_w", pair_id="pair_id", Xid="pid", Zid="pid",
           sampled="sampled")
a <- gof(fit)
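The statistics listed in the return value can be illustrated with a short base R sketch. It uses the standard Pearson chi-square and Kullback-Leibler formulas on two hypothetical household-type distributions, and draws the simulated statistics from a multinomial as a stand-in for model-based simulation. The exact definitions and simulation scheme used inside gof.rpm may differ, and all numbers here are invented.

```r
# Hypothetical shares of three household types.
observed_pmf <- c(0.50, 0.30, 0.20)
model_pmf    <- c(0.40, 0.40, 0.20)
n <- 100                                # hypothetical number of households

# Per-cell contributions, then the totals.
chi_sq_cell <- n * (observed_pmf - model_pmf)^2 / model_pmf
kl_cell     <- ifelse(observed_pmf > 0,
                      observed_pmf * log(observed_pmf / model_pmf), 0)
obs_chi_sq <- sum(chi_sq_cell)
obs_kl     <- sum(kl_cell)

# Empirical p-value: the proportion of simulated statistics that are
# greater than or equal to the observed one. Multinomial draws from the
# model PMF stand in for simulation from a fitted model (an assumption).
set.seed(1)
nsim <- 200
sim_counts <- rmultinom(nsim, size = n, prob = model_pmf)
chi_sq_simulated <- colSums((sim_counts - n * model_pmf)^2 / (n * model_pmf))
empirical_p_chi_sq <- mean(chi_sq_simulated >= obs_chi_sq)

print(c(obs_chi_sq = obs_chi_sq, obs_kl = obs_kl,
        empirical_p_chi_sq = empirical_p_chi_sq))
```

The per-cell vectors show which household types drive any lack of fit, which is the role of the obs_chi_sq_cell and obs_kl_cell elements.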
Returns a list containing the terms in a given formula
list_rhs.formula(object)
object |
formula A formula having a right-hand-side that can be interpreted as an rpm specification.
returns a list containing terms in a given
formula, handling |
list_rhs.formula returns a list of formula terms, with an additional numerical vector attribute "sign" of the same length, giving the corresponding term's sign as +1 or -1.
logLik method for rpm fits.
A function to return the log-likelihood associated with an rpm fit.
## S3 method for class 'rpm' logLik(object, ...)
object |
|
... |
Other arguments to the likelihood functions. |
a logLik object.
library(rpm)
data(fauxmatching)
fit <- rpm(~match("edu") + WtoM_diff("edu",3),
           Xdata=fauxmatching$Xdata, Zdata=fauxmatching$Zdata,
           X_w="X_w", Z_w="Z_w",
           pair_w="pair_w", pair_id="pair_id", Xid="pid", Zid="pid",
           sampled="sampled", sampling_design="stock-flow")
logLik(fit)
Calculate the null model likelihood
logLikNull(object, ...)

## S3 method for class 'rpm'
logLikNull(object, ...)
object |
a fitted model. |
... |
further arguments to lower-level functions.
|
logLikNull returns an object of type logLik if it is able to compute the null model probability, and NA otherwise.
logLikNull(rpm): A method for rpm fits to compute the null likelihood (that is, relative to the constant only model).
A thin wrapper around print that captures its output and prints it as a message, usually to STDERR. This is part of statnet.common.
message_print(..., messageArgs = NULL)
... |
arguments to print. |
messageArgs |
a list of arguments to be passed directly to message. |
No return value, called for side effects.
cat(1:5)
print(1:5)
message_print(1:5) # Looks the same (though may be in a different color on some frontends).
suppressMessages(print(1:5)) # Still prints
suppressMessages(message_print(1:5)) # Silenced
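The behaviour described above, capturing what print would show and re-emitting it as a message, can be sketched in base R as follows. The function name message_print_sketch is hypothetical and this is only one plausible way to write such a wrapper, not the package's source.

```r
# A sketch of a print-to-message wrapper (hypothetical name).
message_print_sketch <- function(..., messageArgs = NULL) {
  # Capture the lines that print() would write to standard output.
  out <- utils::capture.output(print(...))
  # Re-emit them as a single message, forwarding any extra message() arguments.
  do.call(message, c(list(paste(out, collapse = "\n")), messageArgs))
}

message_print_sketch(1:5)                     # shown as a message
suppressMessages(message_print_sketch(1:5))   # silenced

# The text can also be intercepted like any other message condition.
caught <- tryCatch(message_print_sketch(1:5),
                   message = function(m) conditionMessage(m))
```

Because the output travels as a message, callers can silence or redirect it with the usual condition-handling tools, which plain print does not allow.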
microsimulate simulates a population of the pairs and singles from a Revealed Preference Matchings Model. It is typically based on the estimate from an rpm() call.
microsimulate(
  object,
  nsim = 1,
  seed = NULL,
  pmfW_N = NULL,
  pmfM_N = NULL,
  large.population = TRUE,
  bootstrap = FALSE,
  control = control.rpm(),
  counts.only = FALSE,
  verbose = FALSE
)
object |
list; an object of class |
nsim |
Number of matchings to be randomly drawn from the given model on the set of all matchings / singles. |
seed |
integer; (Optional) random number seed. |
pmfW_N |
vector; The population count of the number of women of each type. This should be compatible with the type in the object. |
pmfM_N |
vector; The population count of the number of men of each type. This should be compatible with the type in the object. |
large.population |
logical; If TRUE a large population approximation is used to generate
the matchings (rather than the individual level generation of utilities). This is much faster and
uses a lot less memory. It is TRUE by default. If used, a sample is drawn rather than the
population being returned. The sample size is controlled by |
bootstrap |
logical; If TRUE the original population is sampled from. If FALSE the population underlying the fitted model is sampled from. |
control |
A list of control parameters for algorithm tuning. Constructed using
|
counts.only |
logical; If TRUE only the matrices of counts and the PMF of the population of households are returned. It is FALSE by default. |
verbose |
logical; Should verbose messages be printed out. |
The function requires the numbers of women of each type and the number of men of each type to be specified.
The function rpm is used to fit a revealed preference model for men and women of certain characteristics (or shared characteristics) of people of the opposite sex. The model assumes a one-to-one stable matching using an observed set of matchings and a set of (possibly dyadic) covariates to estimate the parameters for linear equations of utilities. It does this using a large-population likelihood based on ideas from Dagsvik (2000), Menzel (2015) and Goyal et al. (2023).
The model represents the dyadic utility functions as deterministic linear utility functions of dyadic variables. These utility functions are functions of observed characteristics of the women and men. These functions are entered as terms in the function call to rpm. This function simulates a population from such a model.
A list of lists, each a simulation from the population. Each of the simulation lists contains components population being a list with components Xdata and Zdata (for use with rpm()).
Goyal, Shuchi; Handcock, Mark S.; Jackson, Heide M.; Rendall, Michael S. and Yeung, Fiona C. (2023). A Practical Revealed Preference Model for Separating Preferences and Availability Effects in Marriage Formation, Journal of the Royal Statistical Society, A. doi:10.1093/jrsssa/qnad031
Dagsvik, John K. (2000) Aggregation in Matching Markets, International Economic Review, Vol. 41, 27-57. JSTOR: https://www.jstor.org/stable/2648822, doi:10.1111/1468-2354.00054
Menzel, Konrad (2015). Large Matching Markets as Two-Sided Demand Systems, Econometrica, Vol. 83, No. 3 (May, 2015), 897-941. doi:10.3982/ECTA12299
library(rpm)
data(fauxmatching)
fit <- rpm(~match("edu") + WtoM_diff("edu",3),
           Xdata=fauxmatching$Xdata, Zdata=fauxmatching$Zdata,
           X_w="X_w", Z_w="Z_w",
           pair_w="pair_w", pair_id="pair_id", Xid="pid", Zid="pid",
           sampled="sampled")
num_women <- fit$N*exp(fit$gw)
num_men <- fit$N*exp(fit$gm)
pmfW_N <- round(fit$pmfW * num_women)
pmfM_N <- round(fit$pmfM * num_men)
a <- microsimulate(fit, pmfW_N=pmfW_N, pmfM_N=pmfM_N)
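The count vectors passed as pmfW_N and pmfM_N in the example are simple arithmetic on the fitted quantities. The base R sketch below repeats that arithmetic with invented stand-ins for fit$N, fit$gw, fit$gm, fit$pmfW and fit$pmfM, so the numbers are hypothetical and no fitted model is needed.

```r
# Hypothetical stand-ins for the components of a fitted model.
N    <- 1000                  # total population size
gw   <- log(0.52)             # exp(gw) scales N to the number of women
gm   <- log(0.48)             # exp(gm) scales N to the number of men
pmfW <- c(0.25, 0.50, 0.25)   # distribution of women over three types
pmfM <- c(0.50, 0.25, 0.25)   # distribution of men over three types

num_women <- N * exp(gw)
num_men   <- N * exp(gm)

# Population counts of each type, rounded to whole people.
pmfW_N <- round(pmfW * num_women)
pmfM_N <- round(pmfM * num_men)
print(pmfW_N)  # 130 260 130
print(pmfM_N)  # 240 120 120
```

Because of the rounding, the counts need not sum exactly to num_women and num_men in general, which is worth checking when the type distributions are uneven.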
rpm estimates the parameters of a revealed preference model for men and women of certain characteristics (or shared characteristics) of people of the opposite sex. The model assumes a one-to-one stable matching using an observed set of matchings and a set of (possibly dyadic) covariates to estimate the parameters for linear equations of utilities. It does this using a large-population likelihood based on ideas from Dagsvik (2000), Menzel (2015) and Goyal et al. (2023).
rpm(
  formula,
  Xdata,
  Zdata,
  Xid = NULL,
  Zid = NULL,
  pair_id = NULL,
  X_w = NULL,
  Z_w = NULL,
  pair_w = NULL,
  sampled = NULL,
  sampling_design = "stock-flow",
  fixed.margins = NULL,
  control = control.rpm(),
  verbose = FALSE
)
formula |
formula; an |
Xdata |
data.frame for women. Each row is a woman, each column is a variable on that woman
or her partnerships. It must contain the women's ID variable (see |
Zdata |
data.frame for men. Each row is a man, each column is a variable on that man.
It must contain the men's ID variable (see |
Xid |
string The name of the variable in |
Zid |
string The name of the variable in |
pair_id |
string The name of the variable in |
X_w |
string The name of the variable in |
Z_w |
string The name of the variable in |
pair_w |
string The name of the variable in |
sampled |
string The name of the logical variable in |
sampling_design |
string; The name of the sampling protocol used to select the survey data. Valid values are
|
fixed.margins |
list If not NULL the numbers of men and women (i.e, in |
control |
A list of control parameters for algorithm tuning. Constructed using
|
verbose |
logical; if this is |
The pairings are determined by the pair_id
variable in Xdata
.
If that variable is NA then the women is
assumed to be single. If men are listed in Zdata
and are not partnered then they are assumed single.
Weights are specified by three optional variables in Xdata.
X_w: The character string naming the weight variable for women. The sum of the weights should be the number of women in the population.
Z_w: The character string naming the weight variable for men. The sum of the weights should be the number of men in the population.
pair_w: The character string naming the weight variable for pairs.
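The conventions on pairings and weights described above can be sketched with two toy data frames. This is a minimal base R illustration, not package code: the data values, the population totals, the helper rescale() and the assumption that pair_id holds the partner's ID are all hypothetical.

```r
# Toy data (hypothetical values): three women and three men.
# Assumption: pair_id holds the ID of the partner and is NA for a single woman.
Xdata <- data.frame(
  pid     = c("w1", "w2", "w3"),
  pair_id = c("m2", NA, "m1"),
  X_w     = c(400, 350, 250),   # women's weights
  stringsAsFactors = FALSE
)
Zdata <- data.frame(
  pid = c("m1", "m2", "m3"),
  Z_w = c(300, 500, 100),       # men's weights
  stringsAsFactors = FALSE
)

# Women with a missing pair_id are single; men who are nobody's
# partner are single.
single_women <- Xdata$pid[is.na(Xdata$pair_id)]
single_men   <- setdiff(Zdata$pid, Xdata$pair_id)

# The weights should sum to the population counts
# (here assumed to be 1000 women and 900 men).
num_women <- sum(Xdata$X_w)
num_men   <- sum(Zdata$Z_w)

# Hypothetical helper: rescale weights on another scale to a known total.
rescale <- function(w, total) w * total / sum(w)
X_w_rescaled <- rescale(c(2, 1, 1), total = 1000)
```

Checking the weight totals before fitting is a cheap way to catch weights that were left on a per-respondent or normalised scale.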
rpm returns an object of class rpm.object that is a list consisting of the following elements:
coef |
The maximum pseudo-likelihood estimate of |
coefficients |
The bias-corrected bootstrap estimate of |
loglik |
The value of the maximized log-likelihood. |
exitflag |
integer value with the status of the optimization (4 is success as
|
call |
the call that was made to |
x0 |
vector with starting values for the optimization. |
message |
more informative message with the status of the optimization. |
iterations |
number of iterations that were executed. |
objective |
value of the objective function at the solution. |
solution |
optimal value of the controls. |
version |
version of NLopt that was used. |
covar |
Approximate covariance matrix of the estimates. |
eq |
Values from the equality constraints. Larger values indicate non-convergence. |
sample |
A matrix whose number of rows is the MCMC sample size and whose number of columns is the number of parameters. |
Goyal, Shuchi; Handcock, Mark S.; Jackson, Heide M.; Rendall, Michael S. and Yeung, Fiona C. (2023). A Practical Revealed Preference Model for Separating Preferences and Availability Effects in Marriage Formation. Journal of the Royal Statistical Society, A. doi:10.1093/jrsssa/qnad031
Dagsvik, John K. (2000). Aggregation in Matching Markets. International Economic Review, Vol. 41, 27-57. JSTOR: https://www.jstor.org/stable/2648822, doi:10.1111/1468-2354.00054
Menzel, Konrad (2015). Large Matching Markets as Two-Sided Demand Systems. Econometrica, Vol. 83, No. 3 (May, 2015), 897-941. doi:10.3982/ECTA12299
control.rpm, summary.rpm, print.rpm
library(rpm)
data(fauxmatching)
fit <- rpm(~match("edu") + WtoM_diff("edu",3),
  Xdata=fauxmatching$Xdata, Zdata=fauxmatching$Zdata,
  X_w="X_w", Z_w="Z_w",
  pair_w="pair_w", pair_id="pair_id", Xid="pid", Zid="pid",
  sampled="sampled", sampling_design="stock-flow")
summary(fit)
rpm_MLPLE
estimates the parameters of a revealed preference model
for men and women of certain
characteristics (or shared characteristics) of people of the opposite sex.
The model assumes a one-to-one stable matching using an observed set of
matchings and a set of (possibly dyadic) covariates to
estimate the parameters for
linear equations of utilities.
It does this using a large-population likelihood based on ideas from Dagsvik (2000), Menzel (2015) and Goyal et al. (2023).
rpm_MLPLE( formula, Xdata, Zdata, Xid = NULL, Zid = NULL, pair_id = NULL, X_w = NULL, Z_w = NULL, pair_w = NULL, sampled = NULL, sampling_design = "stock-flow", fixed.margins = fixed.margins, control = control.rpm(), verbose = FALSE )
formula |
formula; an |
Xdata |
data.frame for women. Each row is a woman, each column is a variable on that woman
or her partnerships. It must contain the women's ID variable (see |
Zdata |
data.frame for men. Each row is a man, each column is a variable on that man.
It must contain the men's ID variable (see |
Xid |
string The name of the variable in |
Zid |
string The name of the variable in |
pair_id |
string The name of the variable in |
X_w |
string The name of the variable in |
Z_w |
string The name of the variable in |
pair_w |
string The name of the variable in |
sampled |
string The name of the logical variable in |
sampling_design |
string; The name of the sampling protocol used to select the survey data. Valid values are
|
fixed.margins |
list If not NULL the numbers of men and women (i.e, in |
control |
A list of control parameters for algorithm tuning. Constructed using
|
verbose |
logical; if this is |
It is usually called via the rpm function.
The pairings are determined by the pair_id variable in Xdata. If that variable is NA then the woman is assumed to be single. If men are listed in Zdata and are not partnered then they are assumed to be single.
Weights are specified by three optional variables in Xdata.
X_w: The character string naming the weight variable for women. The sum of the weights should be the number of women in the population.
Z_w: The character string naming the weight variable for men. The sum of the weights should be the number of men in the population.
pair_w: The character string naming the weight variable for pairs.
rpm returns an object of class rpm.object that is a list consisting of the following elements:
coefficients |
The bias-corrected bootstrap estimate of |
loglik |
The value of the maximized log-likelihood. |
exitflag |
integer value with the status of the optimization (4 is success as
|
call |
the call that was made to |
x0 |
vector with starting values for the optimization. |
message |
more informative message with the status of the optimization. |
iterations |
number of iterations that were executed. |
objective |
value of the objective function at the solution. |
solution |
optimal value of the controls. |
version |
version of NLopt that was used. |
covar |
Approximate covariance matrix of the estimates. |
eq |
Values from the equality constraints. Larger values indicate non-convergence. |
Goyal, Shuchi; Handcock, Mark S.; Jackson, Heide M.; Rendall, Michael S. and Yeung, Fiona C. (2023). A Practical Revealed Preference Model for Separating Preferences and Availability Effects in Marriage Formation. Journal of the Royal Statistical Society, A. doi:10.1093/jrsssa/qnad031
Dagsvik, John K. (2000). Aggregation in Matching Markets. International Economic Review, Vol. 41, 27-57. JSTOR: https://www.jstor.org/stable/2648822, doi:10.1111/1468-2354.00054
Menzel, Konrad (2015). Large Matching Markets as Two-Sided Demand Systems. Econometrica, Vol. 83, No. 3 (May, 2015), 897-941. doi:10.3982/ECTA12299
control.rpm, summary.rpm, print.rpm
library(rpm)
data(fauxmatching)
fit <- rpm(~match("edu") + WtoM_diff("edu",3),
  Xdata=fauxmatching$Xdata, Zdata=fauxmatching$Zdata,
  X_w="X_w", Z_w="Z_w",
  pair_w="pair_w", pair_id="pair_id", Xid="pid", Zid="pid",
  sampled="sampled", sampling_design="stock-flow")
summary(fit)
The function rpm is used to fit a revealed preference model for men and women of certain characteristics (or shared characteristics) of people of the opposite sex. The model assumes a one-to-one stable matching using an observed set of matchings and a set of (possibly dyadic) covariates to estimate the parameters for linear equations of utilities. It does this using a large-population likelihood based on ideas from Dagsvik (2000), Menzel (2015) and Goyal et al. (2023).
The model represents the dyadic utility functions as deterministic linear utility functions of dyadic variables. These utility functions are functions of observed characteristics of the women and men. These functions are entered as terms in the function call to rpm. This page describes the possible terms (and hence linear utility functions) included in the rpm package.
No return value, called for side effects.
Terms to rpm are specified by a formula that represents the pairings and covariates. This is done via a formula, that is, a formula object of the form ~ <term 1> + <term 2> ..., where <term 1>, <term 2>, etc., are each terms chosen from the list given below.
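As a small illustration of the formula interface, the sketch below builds a one-sided formula and lists its terms using only base R. The term names are those used in the package examples; terms() parses the formula without evaluating the term functions, so the snippet does not need the package attached. It shows how R represents such a formula, not how rpm processes it internally.

```r
# A one-sided formula with two terms (the same terms as the package examples).
f <- ~ match("edu") + WtoM_diff("edu", 3)

# terms() parses the formula without evaluating match() or WtoM_diff().
term_labels <- attr(terms(f), "term.labels")
print(term_labels)

# Terms can also be added programmatically with update().
f2 <- update(f, ~ . + homophily("edu"))
n_terms <- length(attr(terms(f2), "term.labels"))
print(n_terms)
```

Building formulas this way can be convenient when comparing a sequence of nested models.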
absdiff(attr) (quantitative attribute)
Absolute difference: The attr argument specifies a quantitative attribute. This term adds one statistic to the model equaling abs(attr[i]-attr[j]) for each woman-man dyad (i,j).
W_greaterthan(attr)
Women's value greater than the men's value: Adds one statistic indicating if the woman's value exceeds the man's value.
M_greaterthan(attr)
Men's value greater than the women's value: Adds one statistic indicating if the man's value exceeds the woman's value.
W_atleast(attr,threshold=0)
Values greater than or equal to a threshold: Adds one statistic indicating if the woman's value of the attribute equals or exceeds threshold.
W_atmost(threshold=0)
Values less than or equal to a threshold: Adds one statistic indicating if the woman's value equals or is exceeded by threshold.
W_cov(attr) (quantitative attribute)
Main effect of a covariate for women: The attr argument specifies a quantitative attribute. This term adds a single statistic equaling the value of attr(i) for the woman in the dyad. For categorical attributes, see W_factor.
diff(attr) (quantitative attribute)
Woman's Gap: The attr argument specifies a quantitative attribute. This term adds one statistic to the model, attr[i]-attr[j], for the woman i and the man j in the dyad. Specifically, it is the excess of the woman's value over the man's value.
WtoM_diff(attr, diff) (ordinal categorical attribute)
Woman's Gap: The attr argument specifies an ordinal categorical attribute. This term adds one statistic to the model, an indicator that attr[i]=attr[j]+diff for the woman i and the man j in the dyad. Specifically, it indicates if the woman's value is diff higher than the man's value.
MtoW_diff(attr, diff) (ordinal categorical attribute)
Man's Gap: The attr argument specifies an ordinal categorical attribute. This term adds one statistic to the model, an indicator that attr[j]=attr[i]+diff for the woman i and the man j in the dyad. Specifically, it indicates if the man's value is diff higher than the woman's value.
MtoW_diff(attr) (quantitative attribute)
Difference: The attr argument specifies a quantitative attribute. This term adds one statistic to the model, attr[j]-attr[i], for the woman i and the man j in the dyad.
W_factor(attr, base=1, levels=-1) (categorical attribute)
Factor attribute effect for women: The attr argument specifies a categorical attribute. This term adds multiple statistics to the model, one for each of (a subset of) the unique values of the attr attribute. Each of these statistics indicates if the woman has that attribute value.
homophily(attr)
Uniform homophily effect: The attr argument specifies a categorical attribute. This term adds one statistic to the model indicating that the dyad matches on that attribute.
match(attr, diff=FALSE, collapse=NULL)
Attribute-based homophily effect: The attr argument specifies a categorical attribute. This term adds one statistic to the model for each categorical level, unless diff is set to TRUE, in which case the term adds multiple statistics to the model, one for each of (a subset of) the unique values of the attr attribute. If diff is set to TRUE, the optional argument collapse controls which dyads are collapsed (or pooled). Specifically, it is a list of indices of attribute values which are to be collapsed into a single term. For example, collapse=list(c(1,4)) will collapse the (1,1) and the (4,4) dyads into a single term (and group). Multiple lists can be included with arbitrary numbers of dyads in a group.
mix(attr, base=NULL, collapse=NULL)
Attribute mixing: The attr argument specifies a categorical attribute. By default, this term adds one statistic to the model for each possible pairing of attribute values. The statistic indicates if the dyad has that pairing of values. In other words, this term produces one statistic for every entry in the mixing matrix for the attribute(s). The ordering of the attribute values is lexicographic: alphabetical (for nominal categories) or numerical (for ordered categories). The optional argument base controls which statistics are included in the model; specifically, it lists the indices of the omitted terms (in order). For example, base=2 omits the second term. The optional argument collapse controls which dyads are collapsed (or pooled). Specifically, it is a list of lists. Each element of the list is a list of dyads which are to be collapsed into a single term. For example, collapse=list(list(c(1,4),c(2,4))) will collapse the (1,4) and the (2,4) dyads into a single term (and group). Multiple lists can be included with arbitrary numbers of dyads in a group.
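To make the term definitions above concrete, the following base R sketch computes several of the dyadic statistics by hand for a handful of toy woman-man dyads. The data, the column names (w_edu, m_edu, w_age, m_age) and the names of the computed columns are hypothetical; the code follows the formulas stated in the descriptions and does not call, or claim to reproduce, the package's internal implementation.

```r
# Toy attributes (hypothetical): one woman i and one man j per dyad.
# edu is an ordinal category coded 1..4; age is quantitative.
dyads <- data.frame(
  w_edu = c(4, 2, 3, 1),
  m_edu = c(1, 2, 4, 1),
  w_age = c(30, 25, 41, 22),
  m_age = c(33, 25, 38, 29)
)

stats <- with(dyads, data.frame(
  absdiff_age   = abs(w_age - m_age),               # abs(attr[i]-attr[j])
  diff_age      = w_age - m_age,                    # attr[i]-attr[j]
  W_greater_age = as.integer(w_age > m_age),        # woman's value exceeds man's
  WtoM_diff_3   = as.integer(w_edu == m_edu + 3),   # attr[i] = attr[j] + 3
  MtoW_diff_1   = as.integer(m_edu == w_edu + 1),   # attr[j] = attr[i] + 1
  homophily_edu = as.integer(w_edu == m_edu)        # dyad matches on edu
))
print(stats)

# Mixing matrix: counts of dyads for each pairing of attribute values.
lev <- 1:4
mixing <- table(
  women = factor(dyads$w_edu, levels = lev),
  men   = factor(dyads$m_edu, levels = lev)
)
print(mixing)
```

The diagonal of the mixing matrix holds the matched dyads, so its sum equals the sum of the homophily indicator.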
Goyal, Shuchi; Handcock, Mark S.; Jackson, Heide M.; Rendall, Michael S. and Yeung, Fiona C. (2023). A Practical Revealed Preference Model for Separating Preferences and Availability Effects in Marriage Formation. Journal of the Royal Statistical Society, A. doi:10.1093/jrsssa/qnad031
Dagsvik, John K. (2000). Aggregation in Matching Markets. International Economic Review, Vol. 41, 27-57. JSTOR: https://www.jstor.org/stable/2648822, doi:10.1111/1468-2354.00054
Menzel, Konrad (2015). Large Matching Markets as Two-Sided Demand Systems. Econometrica, Vol. 83, No. 3 (May, 2015), 897-941. doi:10.3982/ECTA12299
library(rpm)
data(fauxmatching)
fit <- rpm(~match("edu") + WtoM_diff("edu",3),
  Xdata=fauxmatching$Xdata, Zdata=fauxmatching$Zdata,
  X_w="X_w", Z_w="Z_w",
  pair_w="pair_w", pair_id="pair_id", Xid="pid", Zid="pid",
  sampled="sampled")
summary(fit)
rpm.model.matrix assumes a bipartite network (i.e., a two-sided matching market). It creates a model matrix according to the formula passed in. See rpm-terms for a description of the possible terms.
rpm.model.functions(model.terms, control)
model.terms |
For the details on the possible continuous |
control |
A list of control parameters for algorithm tuning. Constructed using
|
A list of model terms as bivariate functions.
Goyal, Shuchi; Handcock, Mark S.; Jackson, Heide M.; Rendall, Michael S. and Yeung, Fiona C. (2023). A Practical Revealed Preference Model for Separating Preferences and Availability Effects in Marriage Formation. Journal of the Royal Statistical Society, A. doi:10.1093/jrsssa/qnad031
Dagsvik, John K. (2000). Aggregation in Matching Markets. International Economic Review, Vol. 41, 27-57. JSTOR: https://www.jstor.org/stable/2648822, doi:10.1111/1468-2354.00054
Menzel, Konrad (2015). Large Matching Markets as Two-Sided Demand Systems. Econometrica, Vol. 83, No. 3 (May, 2015), 897-941. doi:10.3982/ECTA12299
rpm
# nothing yet
rpm.model.matrix assumes a bipartite network (i.e., a two-sided matching market). It creates a model matrix according to the formula passed in. See rpm-terms for a description of the possible terms.
rpm.model.matrix(model.terms, Xall, Zall, intercept = TRUE)
model.terms |
For the details on the possible |
Xall |
the unique types of women |
Zall |
the unique types of men |
intercept |
logical; If TRUE, the default, an intercept term is prepended. |
A list consisting of the following elements:
X |
the model matrix for women. |
Z |
the model matrix for men. |
Xnames |
the names of the covariates for women. |
Znames |
the names of the covariates for men. |
Goyal, Shuchi; Handcock, Mark S.; Jackson, Heide M.; Rendall, Michael S. and Yeung, Fiona C. (2023). A Practical Revealed Preference Model for Separating Preferences and Availability Effects in Marriage Formation. Journal of the Royal Statistical Society, A. doi:10.1093/jrsssa/qnad031
Dagsvik, John K. (2000). Aggregation in Matching Markets. International Economic Review, Vol. 41, 27-57. JSTOR: https://www.jstor.org/stable/2648822, doi:10.1111/1468-2354.00054
Menzel, Konrad (2015). Large Matching Markets as Two-Sided Demand Systems. Econometrica, Vol. 83, No. 3 (May, 2015), 897-941. doi:10.3982/ECTA12299
rpm
# nothing yet
rpmpopulationpmf computes the probability mass function for a population of the pairs and singles from a Revealed Preference Matchings Model based on an arbitrary availability distribution and preferences. It is typically based on the estimate from an rpm() call.
rpmpopulationpmf( object, N = 2000, num_women = NULL, num_men = NULL, pmfW = NULL, pmfM = NULL, verbose = FALSE )
object |
list; an object of class |
N |
integer; The total population size. This must be set. The number of women and men are derived from the (weighted) data. |
num_women |
integer; (Optional) The number of women in the population. |
num_men |
integer; (Optional) The number of men in the population. |
pmfW |
vector; (Optional) The population proportions of the numbers of women of each type. This should be compatible with the type in the object. |
pmfM |
vector; (Optional) The population proportions of the numbers of men of each type. This should be compatible with the type in the object. |
verbose |
logical; Should verbose messages be printed out. |
The function rpm is used to fit a revealed preference model for men and women of certain characteristics (or shared characteristics) of people of the opposite sex. The model assumes a one-to-one stable matching using an observed set of matchings and a set of (possibly dyadic) covariates to estimate the parameters for linear equations of utilities. It does this using a large-population likelihood based on ideas from Dagsvik (2000), Menzel (2015) and Goyal et al. (2023).
The model represents the dyadic utility functions as deterministic linear utility functions of dyadic variables. These utility functions are functions of observed characteristics of the women and men. These functions are entered as terms in the function call to rpm. This function simulates from such a model.
A list of data.frame, each a simulation from the population.
Goyal, Shuchi; Handcock, Mark S.; Jackson, Heide M.; Rendall, Michael S. and Yeung, Fiona C. (2023). A Practical Revealed Preference Model for Separating Preferences and Availability Effects in Marriage Formation. Journal of the Royal Statistical Society, A. doi:10.1093/jrsssa/qnad031
Dagsvik, John K. (2000). Aggregation in Matching Markets. International Economic Review, Vol. 41, 27-57. JSTOR: https://www.jstor.org/stable/2648822, doi:10.1111/1468-2354.00054
Menzel, Konrad (2015). Large Matching Markets as Two-Sided Demand Systems. Econometrica, Vol. 83, No. 3 (May, 2015), 897-941. doi:10.3982/ECTA12299
library(rpm)
data(fauxmatching)
fit <- rpm(~match("edu") + WtoM_diff("edu",3),
  Xdata=fauxmatching$Xdata, Zdata=fauxmatching$Zdata,
  X_w="X_w", Z_w="Z_w",
  pair_w="pair_w", pair_id="pair_id", Xid="pid", Zid="pid",
  sampled="sampled")
a <- rpmpopulationpmf(fit)
simulate.rpm simulates a population of the pairs and singles from a Revealed Preference Matchings Model. It is typically based on the estimate from an rpm() call.
## S3 method for class 'rpm'
simulate( object, nsim = 1, seed = NULL, ..., N = NULL, num_women = NULL, num_men = NULL, pmfW = NULL, pmfM = NULL, large.population = TRUE, num_sampled = NULL, bootstrap = FALSE, sampling_design = NULL, control = control.rpm(), verbose = FALSE )
object |
list; an object of class |
nsim |
Number of matchings to be randomly drawn from the given model on the set of all matchings / singles. |
seed |
integer; (Optional) random number seed. |
... |
Additional arguments, to be passed to lower-level functions. |
N |
integer; The total population size. This must be set. The number of women and men are derived from the (weighted) data. |
num_women |
integer; (Optional) The number of women in the population. |
num_men |
integer; (Optional) The number of men in the population. |
pmfW |
vector; (Optional) The population proportions of the numbers of women of each type. This should be compatible with the type in the object. |
pmfM |
vector; (Optional) The population proportions of the numbers of men of each type. This should be compatible with the type in the object. |
large.population |
logical; If TRUE a large population approximation is used to generate
the matchings (rather than the individual level generation of utilities). This is much faster and
uses a lot less memory. It is TRUE by default. If used, a sample is drawn rather than the
population being returned. The sample size is controlled by |
num_sampled |
integer; The size of the sample to be drawn. For "stock-stock" sampling this is the number of sampled households. For "stock-flow" it is the number of sampled people. For "census" it is the total population size, N. If NULL the size is the same as the passed fitted object (that is, the original data), although this is only a guess and it should be explicitly set. |
bootstrap |
logical; If TRUE the original population is sampled from. If FALSE the population underlying the fitted model is sampled from. |
sampling_design |
string; The name of the sampling protocol used to select the survey data. Valid values are
|
control |
A list of control parameters for algorithm tuning. Constructed using
|
verbose |
logical; Should verbose messages be printed out. |
The function rpm is used to fit a revealed preference model for men and women of certain characteristics (or shared characteristics) of people of the opposite sex. The model assumes a one-to-one stable matching using an observed set of matchings and a set of (possibly dyadic) covariates to estimate the parameters for linear equations of utilities. It does this using a large-population likelihood based on ideas from Dagsvik (2000), Menzel (2015) and Goyal et al. (2023).
The model represents the dyadic utility functions as deterministic linear utility functions of dyadic variables. These utility functions are functions of observed characteristics of the women and men. These functions are entered as terms in the function call to rpm. This function simulates from such a model.
A list of data.frame, each a simulation from the population.
Goyal, Shuchi; Handcock, Mark S.; Jackson, Heide M.; Rendall, Michael S. and Yeung, Fiona C. (2023). A Practical Revealed Preference Model for Separating Preferences and Availability Effects in Marriage Formation. Journal of the Royal Statistical Society, A. doi:10.1093/jrsssa/qnad031
Dagsvik, John K. (2000). Aggregation in Matching Markets. International Economic Review, Vol. 41, 27-57. JSTOR: https://www.jstor.org/stable/2648822, doi:10.1111/1468-2354.00054
Menzel, Konrad (2015). Large Matching Markets as Two-Sided Demand Systems. Econometrica, Vol. 83, No. 3 (May, 2015), 897-941. doi:10.3982/ECTA12299
library(rpm)
data(fauxmatching)
fit <- rpm(~match("edu") + WtoM_diff("edu",3),
  Xdata=fauxmatching$Xdata, Zdata=fauxmatching$Zdata,
  X_w="X_w", Z_w="Z_w",
  pair_w="pair_w", pair_id="pair_id", Xid="pid", Zid="pid",
  sampled="sampled")
a <- simulate(fit)
summary_rpm
produces tabular summaries of data on revealed preference matchings
based on a formula specifying a revealed preference model
for men and women of certain
characteristics (or shared characteristics) of people of the opposite sex.
The model assumes a one-to-one stable matching using an observed set of
matchings and a set of (possibly dyadic) covariates to
estimate the parameters for
linear equations of utilities.
summary_rpm( formula, Xdata, Zdata, Xid = NULL, Zid = NULL, pair_id = NULL, X_w = NULL, Z_w = NULL, pair_w = NULL, sampled = NULL, sampling_design = "stock-flow", control = control.rpm(), verbose = FALSE )
formula |
formula; an |
Xdata |
data.frame for women. Each row is a woman, each column is a variable on that woman
or her partnerships. It must contain the women's ID variable (see |
Zdata |
data.frame for men. Each row is a man, each column is a variable on that man.
It must contain the men's ID variable (see |
Xid |
string The name of the variable in |
Zid |
string The name of the variable in |
pair_id |
string The name of the variable in |
X_w |
string The name of the variable in |
Z_w |
string The name of the variable in |
pair_w |
string The name of the variable in |
sampled |
string The name of the logical variable in |
sampling_design |
string; The name of the sampling protocol used to select the survey data. Valid values are
|
control |
A list of control parameters for algorithm tuning. Constructed using
|
verbose |
logical; if this is |
The pairings are determined by the pair_id variable in Xdata. If that variable is NA then the woman is assumed to be single. If men are listed in Zdata and are not partnered then they are assumed to be single.
Weights are specified by three optional variables in Xdata.
X_w: The character string naming the weight variable for women. The sum of the weights should be the number of women in the population.
Z_w: The character string naming the weight variable for men. The sum of the weights should be the number of men in the population.
pair_w: The character string naming the weight variable for pairs.
summary returns a list with many components, like an rpm object without the model estimates. In particular it includes stats and popstats. stats is the named vector of sample statistics from the model, while popstats is the named vector of population statistics from the model. It also includes counts and pmf. Each of these is a contingency table in array representation of S3 class c("xtabs", "table"), with a "call"
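For readers unfamiliar with the c("xtabs", "table") representation, the base R sketch below builds a weighted contingency table of couple types and normalises it to a probability mass function. The data and the column names (w_edu, m_edu, pair_w) are hypothetical, and the sketch covers couples only; it is not a reproduction of how the package constructs its counts and pmf components.

```r
# Toy weighted couples data (hypothetical values).
pairs <- data.frame(
  w_edu  = c("low", "low", "high", "high", "high"),
  m_edu  = c("low", "high", "low", "high", "high"),
  pair_w = c(10, 5, 5, 20, 10)
)

# Weighted contingency table of couple types (women in rows, men in columns).
counts <- xtabs(pair_w ~ w_edu + m_edu, data = pairs)

# Normalising the counts gives a probability mass function over couple types.
pmf <- counts / sum(counts)

class(counts)   # "xtabs" "table"
print(pmf)
```

Objects of this class work with the usual table tools such as margin.table() and ftable().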
Goyal, Shuchi; Handcock, Mark S.; Jackson, Heide M.; Rendall, Michael S. and Yeung, Fiona C. (2023). A Practical Revealed Preference Model for Separating Preferences and Availability Effects in Marriage Formation. Journal of the Royal Statistical Society, A. doi:10.1093/jrsssa/qnad031
Dagsvik, John K. (2000). Aggregation in Matching Markets. International Economic Review, Vol. 41, 27-57. JSTOR: https://www.jstor.org/stable/2648822, doi:10.1111/1468-2354.00054
Menzel, Konrad (2015). Large Matching Markets as Two-Sided Demand Systems. Econometrica, Vol. 83, No. 3 (May, 2015), 897-941. doi:10.3982/ECTA12299
control.rpm, summary.rpm, rpm
library(rpm)
data(fauxmatching)
summary_rpm(~match("edu") + WtoM_diff("edu",3),
  Xdata=fauxmatching$Xdata, Zdata=fauxmatching$Zdata,
  X_w="X_w", Z_w="Z_w",
  pair_w="pair_w", pair_id="pair_id", Xid="pid", Zid="pid",
  sampled="sampled", sampling_design="stock-flow")
[base::summary()] method for [rpm()] fits.
## S3 method for class 'rpm'
summary( object, ..., digits = max(3, getOption("digits") - 3), correlation = FALSE, covariance = FALSE, include.single = TRUE )
## S3 method for class 'summary.rpm'
print( x, digits = max(3, getOption("digits") - 3), correlation = FALSE, covariance = FALSE, signif.stars = getOption("show.signif.stars"), eps.Pvalue = 1e-04, print.header = TRUE, print.formula = FALSE, print.fitinfo = TRUE, print.coefmat = TRUE, print.message = TRUE, print.deviances = TRUE, print.drop = TRUE, ... )
object |
an object of class |
... |
For [summary.rpm()] additional arguments are passed to [logLik.rpm()]. For [print.summary.rpm()], to [stats::printCoefmat()]. |
digits |
significant digits for coefficients. The default is max(3, getOption("digits")-3). |
correlation |
logical whether the correlation matrix of the estimated parameters should be printed (T or F); default=FALSE |
covariance |
logical whether the covariance matrix of the estimated parameters should be printed (T or F); default=FALSE |
include.single |
logical; if 'TRUE', include in the summary table the coefficients of the log-odds of being single for each category of women and men. |
x |
object of class 'summary.rpm' returned by [summary.rpm()]. |
signif.stars |
whether to print dots and stars to signify statistical significance. See [print.summary.lm()]. |
eps.Pvalue |
|
print.formula , print.fitinfo , print.coefmat , print.message , print.deviances , print.drop , print.header
|
which components of the fit summary to print. |
[summary.rpm()] tries to be smart about formatting the coefficients, standard errors, etc.
The default printout of the summary object contains the call, number of iterations used, null and residual deviances, and the values of AIC and BIC. The coefficient table contains the following columns:
- 'Estimate', 'Std. Error' - parameter estimates and their standard errors
- 'z value', 'Pr(>|z|)' - z-test and p-values
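The relationship between these four columns can be sketched in base R. The example assumes the standard two-sided Wald z-test (z = estimate / standard error, with a normal reference distribution); the source does not state the exact computation used by the package, and the estimates below are made-up numbers, not output from rpm.

```r
# Hypothetical estimates and standard errors (not output from rpm).
est <- c(intercept = -1.20, match.edu = 0.85)
se  <- c(intercept =  0.30, match.edu = 0.25)

z_value <- est / se                   # Wald z statistic
p_value <- 2 * pnorm(-abs(z_value))   # two-sided normal p-value

coef_table <- cbind(
  Estimate     = est,
  `Std. Error` = se,
  `z value`    = z_value,
  `Pr(>|z|)`   = p_value
)
printCoefmat(coef_table, digits = 3)
```

printCoefmat() is the stats function named in the argument description above as the recipient of extra print arguments, so the layout should resemble the printed summary.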
The function [summary.rpm()] computes and returns a list of summary statistics of the fitted [rpm()] model given in 'object'. Note that for backwards compatibility, it returns the coefficient table.
The returned object is a list of class "summary.rpm" with the following elements:
formula |
rpm model formula |
digits |
the 'digits' inputted to <summary.rpm> or the default value (despite the fact the digits will be 5) |
correlation , covariance
|
whether to print correlation/covariance matrices of the estimated parameters |
iterations |
object$iterations |
control |
the [control.rpm()] object used |
samplesize |
MCMC sample size |
message |
optional message on the validity of the standard error estimates |
aic.null , bic.null
|
values of AIC and BIC for the null model |
aic , bic
|
values of AIC and BIC |
coefficients |
data frames with model parameters and associated statistics |
asycov |
asymptotic covariance matrix |
asyse |
asymptotic standard error matrix |
offset , drop , estimate , iterations , mle.lik , null.lik
|
see documentation of the object returned by [rpm()] |
The model fitting function [rpm()], [print.rpm()], and [base::summary()]. Function [stats::coef()] will extract the data frame of coefficients with standard errors, t-statistics and p-values.
library(rpm)
data(fauxmatching)
fit <- rpm(~match("edu") + WtoM_diff("edu",3),
  Xdata=fauxmatching$Xdata, Zdata=fauxmatching$Zdata,
  X_w="X_w", Z_w="Z_w",
  pair_w="pair_w", pair_id="pair_id", Xid="pid", Zid="pid",
  sampled="sampled", sampling_design="stock-flow")
summary(fit)
Extract or replace the *ult*imate (last) element of a vector or a list, or an element counting from the end.
ult(x, i = 1L) <- value
ult(x, i = 1L)
x |
a vector or a list. |
i |
index from the end of the list to extract or replace (where 1 is the last element, 2 is the penultimate element, etc.). |
value |
Replacement value for the 'i'th element from the end. |
An element of 'x'.
Due to the way in which assigning to a function is implemented in R, 'ult(x) <- e' may be less efficient than 'x[[length(x)]] <- e'.
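The indexing that ult() describes can be written directly in base R. The two functions below, last_elt() and its replacement form, are hypothetical stand-ins written for illustration; the package's ult() may be implemented differently.

```r
# Hypothetical stand-ins for illustration only.
# Extract the i-th element counting from the end (1 = last).
last_elt <- function(x, i = 1L) x[[length(x) - i + 1L]]

# Replacement form: assign to the i-th element counting from the end.
`last_elt<-` <- function(x, i = 1L, value) {
  x[[length(x) - i + 1L]] <- value
  x
}

x <- 1:5
last_elt(x)        # last element
last_elt(x, 2)     # penultimate element
last_elt(x) <- 6L  # replace the last element
x
```

A replacement function receives and returns the whole object, which is the kind of overhead the efficiency note above is referring to.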
(x <- c(1:5))
(ult(x) <- 6)
(ult(x, 2) <- 7) # 2nd last.
x
x <- 1:5
(last <- ult(x))
(penultimate <- ult(x, 2)) # 2nd last.