idlogit is a Python package for estimating "idLogit" models, that is, Logit models with Idiosyncratic Deviations. The idLogit is a non-parametric model of choice heterogeneity whose maximum likelihood estimation problem is convex.
See this article for methodological details.
As usual, install with pip install idlogit. This package requires numpy, scipy, and ecos.
The most basic call is
x , info = idlogit( K , I , N , y , X , ind )
where
- K (integer) is the number of model features
- I (integer) is the number of individuals in the observations
- N (integer) is the number of observations
- y (numpy.array) is an N-vector of (binary) choices, coded as +/- 1
- X (numpy.array or scipy.sparse) is an NxK-matrix of observation-specific features (dense or sparse)
- ind (list or numpy.array) is an N-vector of observation-individual assignments in {1,...,I}
and
- x (numpy.array) is a K-vector of estimated coefficients
- info (numpy.array) is the ECOS information structure resulting from the solve attempt
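As a rough illustration of the input shapes and codings listed above, the following sketch builds a small synthetic data set. The helper name make_toy_inputs and the simulated "true" coefficients are made up for this example, and the import path shown in the trailing comment is an assumption rather than something this README confirms.

```python
import numpy as np

def make_toy_inputs(K=3, I=5, N=40, seed=0):
    """Build synthetic inputs with the shapes described above (illustrative only)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(N, K))            # N x K matrix of features (dense)
    ind = np.arange(N) % I + 1             # individual ids in {1,...,I}, each one used
    b_true = rng.normal(size=K)            # hypothetical coefficients for simulation
    u = X @ b_true + rng.logistic(size=N)  # latent utility with logistic noise
    y = np.where(u > 0, 1, -1)             # binary choices coded as +/- 1
    return K, I, N, y, X, ind

K, I, N, y, X, ind = make_toy_inputs()
print(X.shape, y[:5], ind[:5])

# With the package installed, the basic call would then look like this
# (the import path is an assumption, not confirmed by the text):
#   from idlogit import idlogit
#   x, info = idlogit(K, I, N, y, X, ind)
```

Note that ind uses 1-based individual labels, matching the {1,...,I} convention stated above.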
There are, of course, options we cover below. If a sparse X matrix is passed, it is internally transformed into a scipy.sparse.coo_matrix before use. If a dense X matrix is passed, it is not processed as a sparse matrix; that is to say, idlogit presumes all of X's entries are nonzero. If this is not the case (for example, you have hard-coded dummies in the data), passing a sparse matrix may be much more efficient.
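To make the dense-versus-sparse point concrete, here is a minimal sketch with made-up data: one numeric column plus a block of hard-coded 0/1 dummies, converted to a scipy.sparse.coo_matrix. The sizes and variable names are arbitrary choices for the example.

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)
N, levels = 1000, 20

# One numeric column plus 20 hard-coded 0/1 dummy columns.
numeric = rng.normal(size=(N, 1))
codes = rng.integers(0, levels, size=N)
dummies = np.zeros((N, levels))
dummies[np.arange(N), codes] = 1.0
X_dense = np.hstack([numeric, dummies])

# COO format stores only the nonzero entries as (row, col, value) triplets.
X_sparse = sparse.coo_matrix(X_dense)

density = X_sparse.nnz / X_dense.size
print(f"stored entries: {X_sparse.nnz} of {X_dense.size} ({density:.1%})")
```

In data like this, most entries of X are zero, so the sparse form carries far fewer stored values than the dense array.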
Options that can currently be passed:
- constant (boolean) Include a constant in the model, or not. The returned x will be K+1 if True, with the first element being the estimated parameter corresponding to the constant.
- outopt (boolean) Is there an "outside good", "outside option", or no-choice option?
- Lambdas (list) A 2-element list of L1 and L2 penalty parameter values (respectively).
- bin (list) A list of indices from 1,...,K that identify which variables in X are binary (0/1). Indices must be mutually exclusive with cat. Binary variables are encoded with a single dummy equal to 1 for any "truthy" value in X. Variables not in bin or cat are interpreted as numerical and not transformed.
- cat (list) A list of indices from 1,...,K that identify which are categorical (finite, with level-specific coefficients). Indices must be mutually exclusive with bin. Categorical variables are analyzed for their cardinality and subsequently "expanded" into level-dummies whose coefficients are constrained to sum to zero for identification. Variables not in bin or cat are interpreted as numerical and not transformed.
- prints (dict) A dictionary of prints of the ECOS data created (for debugging, really). Valid keys are start, costs, lineq, lerhs, cones, ccrhs, and valid values are booleans (or anything "truthy").
as well as any options for ecos-python passed directly to ECOS as **kwargs.
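The options above might be collected as follows. This is only a sketch: passing them as keyword arguments is an assumption, the values are arbitrary, and describe_encoding is a hypothetical helper that mimics the bin/cat descriptions for illustration. It is not the package's internal code.

```python
import numpy as np

# Option values are arbitrary; passing them as keyword arguments is an assumption.
options = {
    "constant": True,
    "outopt": False,
    "Lambdas": [1.0, 1.0],              # [L1, L2] penalty parameters
    "bin": [2],                         # 1-based index of a binary column
    "cat": [3],                         # 1-based index of a categorical column
    "prints": {"start": True, "costs": False},
    "verbose": False,                   # an ecos-python option, forwarded to ECOS
}
assert set(options["bin"]).isdisjoint(options["cat"])  # bin and cat must not overlap

def describe_encoding(X, bin_idx, cat_idx):
    """Hypothetical helper: show what the described encodings would produce."""
    out = {}
    for k in range(1, X.shape[1] + 1):  # indices are 1-based, as in the text
        col = X[:, k - 1]
        if k in bin_idx:
            out[k] = ("binary", (col != 0).astype(float))  # 1 for any truthy value
        elif k in cat_idx:
            levels = np.unique(col)
            # One dummy column per observed level of the variable.
            out[k] = ("categorical", (col[:, None] == levels[None, :]).astype(float))
        else:
            out[k] = ("numerical", col)  # left untransformed
    return out

X = np.array([[0.5, 0, 2], [1.2, 3, 0], [0.1, 1, 2], [2.0, 0, 1]])
enc = describe_encoding(X, options["bin"], options["cat"])
for k, (kind, values) in enc.items():
    print(k, kind, np.shape(values))
```

The sum-to-zero constraint on the level coefficients is imposed in the estimation problem itself, so it does not appear in this encoding sketch.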
This code solves problems of the general form
min 1/N sum_n log( 1 + exp{ -y_n x_n'( b + d_{i(n)} ) } ) + L1/N || d ||_1 + L2/2N || d ||_2
wrt b , d_1 , ... , d_I in Real(K)
sto d_1 + ... + d_I = 0
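For readers who want to evaluate this objective at a candidate point, here is a small numpy sketch. It follows the formula as written, reading "L2/2N" as L2/(2N) and taking || d ||_2 as the unsquared 2-norm of the stacked deviations; both readings are assumptions. The function name and argument layout (deviations stored as rows of an I x K array D) are choices made for this example.

```python
import numpy as np

def idlogit_objective(b, D, y, X, ind, L1=0.0, L2=0.0):
    """Evaluate the objective as written above.

    b: (K,) coefficients; D: (I, K) array whose i-th row is d_i;
    ind: length-N assignments in {1,...,I}.
    """
    ind = np.asarray(ind)
    N = X.shape[0]
    coef = b[None, :] + D[ind - 1, :]            # b + d_{i(n)} for each observation n
    margins = y * np.sum(X * coef, axis=1)       # y_n x_n'( b + d_{i(n)} )
    loss = np.mean(np.logaddexp(0.0, -margins))  # stable log( 1 + exp{ -margin } )
    l1 = L1 / N * np.abs(D).sum()                # L1/N || d ||_1
    l2 = L2 / (2 * N) * np.linalg.norm(D)        # L2/2N || d ||_2 (unsquared, as written)
    return loss + l1 + l2

def is_feasible(D, tol=1e-9):
    """Check the constraint d_1 + ... + d_I = 0."""
    return bool(np.all(np.abs(D.sum(axis=0)) <= tol))

rng = np.random.default_rng(1)
N, K, I = 12, 2, 3
X = rng.normal(size=(N, K))
y = np.where(rng.normal(size=N) > 0, 1, -1)
ind = np.arange(N) % I + 1
D = rng.normal(size=(I, K))
D -= D.mean(axis=0)                              # center so the deviations sum to zero
print(is_feasible(D), idlogit_objective(np.zeros(K), D, y, X, ind, L1=1.0, L2=1.0))
```

With b = 0 and all d_i = 0, every margin is zero and the objective reduces to log(2), which is a convenient sanity check.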
The solve is done by transforming this problem into an equivalent Exponential Cone Programming problem that can be passed to the ECOS solver.
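One standard way to express a logistic loss term with exponential cones is the equivalence t >= log(1 + exp(u)) if and only if exp(u - t) + exp(-t) <= 1, where each exponential is bounded by an auxiliary variable through one cone constraint. The sketch below checks this equivalence numerically. Whether idlogit uses exactly this construction is not stated here, and the ordering of cone coordinates differs between solvers, so treat the function names and the convention used as assumptions.

```python
import numpy as np

def in_exp_cone(a, z, tol=1e-9):
    """Test z >= exp(a), i.e. (a, 1, z) in the cone {(x, y, z): y*exp(x/y) <= z, y > 0}."""
    return bool(np.exp(a) <= z + tol)

def softplus_epigraph(u, t, tol=1e-9):
    """Check t >= log(1 + exp(u)) via two exponential-cone constraints."""
    z1, z2 = np.exp(u - t), np.exp(-t)  # tightest feasible auxiliary variables
    return (in_exp_cone(u - t, z1, tol)
            and in_exp_cone(-t, z2, tol)
            and bool(z1 + z2 <= 1.0 + tol))

for u in (-3.0, 0.0, 2.5):
    t_star = np.logaddexp(0.0, u)       # exact value of log(1 + exp(u))
    print(u, softplus_epigraph(u, t_star), softplus_epigraph(u, t_star - 0.1))
```

In the problem above, u would play the role of -y_n x_n'( b + d_{i(n)} ), which is linear in the unknowns, so each observation would contribute one such epigraph constraint.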