Logit regression

eoudejans edited this page Jul 23, 2024 · 8 revisions

If one assumes that the probability $P(ij)$ that actor $i = 1..n$ chooses alternative $j = 1..k$ is proportional (within the set of alternative choices) to the exponent of a linear combination of $p = 1..P$ data values $X_{ijp}$ related to $i$ and $j$, one arrives at the logit model, or more formally:

Assume $P(ij) \propto w_{ij}$, with $w_{ij} := \exp(v_{ij})$ and $v_{ij} := \sum_p \beta_p X_{ijp}$.

Thus $L(ij) := \log(P(ij))$, which equals $v_{ij}$ up to an additive constant.

Consequently, $w_{ij} > 0$ and $P(ij) := \frac{w_{ij}}{\sum_{j'} w_{ij'}}$, since $\sum_j P(ij)$ must be $1$.

Note that:

  • $v_{ij}$ is a linear combination of the $X_{ijp}$ with weights $\beta_p$ as logit model parameters.
  • the odds ratio $\frac{P(ij)}{P(ij')}$ of choice $j$ against alternative $j'$ is equal to $\frac{w_{ij}}{w_{ij'}} = \exp(v_{ij} - v_{ij'}) = \exp\left(\sum_p \beta_p (X_{ijp} - X_{ij'p})\right)$
  • this formulation does not require a separate beta index (aka parameter space dimension) per alternative choice $j$ for each exogenous variable.
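As a concrete illustration of the definitions above, $P(ij)$ is a softmax over the alternatives $j$ of the linear scores $v_{ij}$. The sketch below is mine, not part of this wiki: the names `choice_probabilities`, `X`, `beta` and the dense array layout `X[i, j, p]` are assumptions.

```python
import numpy as np

def choice_probabilities(X, beta):
    """Logit choice probabilities P(ij) for data X[i, j, p] and parameters beta[p]."""
    v = X @ beta                             # v_ij = sum_p beta_p * X_ijp, shape (n, k)
    v = v - v.max(axis=1, keepdims=True)     # shift per actor for numerical stability
    w = np.exp(v)                            # weights w_ij > 0
    return w / w.sum(axis=1, keepdims=True)  # P_ij = w_ij / sum_j' w_ij'

# hypothetical data: n=3 actors, k=2 alternatives, P=2 factors
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 2, 2))
beta = np.array([0.5, -1.0])
P = choice_probabilities(X, beta)            # each row sums to 1
```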

observed data

Observed choices $Y_{ij}$ are assumed to be drawn from a repeated Bernoulli experiment with probabilities $P(ij)$.

Thus $P(Y) = \prod_i N_i! \times \prod_j \frac{P(ij)^{Y_{ij}}}{Y_{ij}!}$ with $N_i := \sum_j Y_{ij}$.

Thus

$$
\begin{aligned}
L(Y) := \log(P(Y)) &= \log \prod_i N_i! \times \prod_j \frac{P(ij)^{Y_{ij}}}{Y_{ij}!} \\
&= C + \sum_i \sum_j Y_{ij} \times \log(P(ij)) \\
&= C + \sum_i \Bigl[ \sum_j Y_{ij} \times L(ij) \Bigr] \\
&= C + \sum_i \Bigl[ \sum_j Y_{ij} \times \bigl( v_{ij} - \log \sum_{j'} w_{ij'} \bigr) \Bigr] \\
&= C + \sum_i \Bigl[ \bigl( \sum_j Y_{ij} \times v_{ij} \bigr) - N_i \times \log \sum_{j'} w_{ij'} \Bigr]
\end{aligned}
$$

with $C = \sum_i C_i$ and $C_i := \log(N_i!) - \sum_j \log(Y_{ij}!)$, which is independent of $P(ij)$ and $\beta_p$. Note that $N_i = 1 \Rightarrow C_i = 0$.
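The log-likelihood above can be sketched directly; the function name and array layout below are my assumptions, and `gammaln(x + 1)` stands in for $\log(x!)$ to compute the constant $C$:

```python
import numpy as np
from scipy.special import gammaln

def log_likelihood(Y, X, beta):
    """L(Y) = C + sum_i [ sum_j Y_ij * v_ij - N_i * log sum_j' w_ij' ]."""
    v = X @ beta                                      # v_ij
    N = Y.sum(axis=1)                                 # N_i = sum_j Y_ij
    log_denom = np.log(np.exp(v).sum(axis=1))         # log sum_j' w_ij'
    C = gammaln(N + 1).sum() - gammaln(Y + 1).sum()   # sum_i [log(N_i!) - sum_j log(Y_ij!)]
    return C + ((Y * v).sum(axis=1) - N * log_denom).sum()
```

Because $C$ is included, this agrees term by term with the multinomial log-probability of the observed counts.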

specification

The presented form $v_{ij} := \beta_p \times X_{ijp}$ (using Einstein notation from here on) is more generic than known implementations of logistic regression (such as in SPSS and R), where $X_{iq}$ is a set of $q = 1..Q$ data values given for each $i$ (with $X_{i0}$ set to $1$ to represent the intercept for each $j$) and $(k-1) \times (Q+1)$ parameters are to be estimated, thus $v_{ij} := \beta_{jq} \times X_{iq}$ for $j = 2..k$. This requires a different beta for each alternative choice and data set, causing an unnecessarily large parameter space.

The latter specification can be reduced to the more generic form by:

  • assigning a unique $p$ to each $(j, q)$ combination, represented by $A_{jqp}$.
  • defining $X_{ijp} := A_{jqp} \times X_{iq}$ for $j = 2..k$, thus creating redundant and zero data values.

However, a generic model cannot be reduced to a specification with different $\beta$'s for each alternative choice unless the latter parameter space can be restricted to contain no more dimensions than the generic form. With large $n$ and $k$, the data values $X_{ijp}$ can be huge. To mitigate the data size, the following tricks can be applied:

  • limit the set of combinations of $i$ and $j$ to the most probable or near $j$'s for each $i$ and/or cluster the other $j$'s.
  • use only a sample from the set of possible $i$'s.
  • support specific forms of data:

| # | form | reduction | description |
|---|------|-----------|-------------|
| 0 | $\beta_p X_{ijp}$ | | general form of $p$ factors specific for each $i$ and $j$ |
| 1 | $\beta_p A_{jqp} X_{iq}$ | $X_{ijp} := A_{jqp} X_{iq}$ | $q$ factors that vary with $i$ but not with $j$ |
| 2 | $\beta_p X_{ip} X_{jp}$ | $X_{ijp} := X_{jp} X_{ip}$ | $p$ specific factors in simple multiplicative form |
| 3 | $\beta_{jq} X_{iq}$ | | $q$ factors that vary with $j$ but not with $i$ |
| 4 | $\beta_p X_{jp}$ | $X_{ijp} := X_{jp}$ | state constants $D_j$ |
| 5 | $\beta_j$ | | state dependent intercept |
| 6 | $\beta_p (J_{ip} == j)$ | | usage of a recorded preference |
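A small sketch of the reduction described above: the per-alternative specification $v_{ij} = \beta_{jq} X_{iq}$ is mapped into the generic form through the indicator array $A_{jqp}$. All names and sizes here are hypothetical; the point is that the materialized $X_{ijp}$ is mostly zeros, while a direct matrix product gives the same $v_{ij}$ without it.

```python
import numpy as np

n, k, Q = 4, 3, 2                      # actors, alternatives, data values per actor
rng = np.random.default_rng(2)
X_iq = rng.normal(size=(n, Q))
beta_jq = rng.normal(size=(k, Q))
beta_jq[0] = 0.0                       # reference alternative: betas fixed at zero

# direct per-alternative form: v_ij = sum_q beta_jq * X_iq
v_direct = X_iq @ beta_jq.T            # shape (n, k)

# generic form: assign a unique p to each (j, q) combination via indicator A[j, q, p]
P_dim = k * Q
A = np.zeros((k, Q, P_dim))
for j in range(k):
    for q in range(Q):
        A[j, q, j * Q + q] = 1.0
X_ijp = np.einsum('jqp,iq->ijp', A, X_iq)   # X_ijp := A_jqp * X_iq (mostly zeros)
beta_p = beta_jq.reshape(-1)                # flatten beta_jq into a generic beta_p
v_generic = X_ijp @ beta_p                  # same v_ij as the direct form
```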

regression

The $\beta_p$'s are found by maximizing the likelihood $L(Y|\beta)$, which is equivalent to finding the maximum of $\sum_i \left[ \sum_j Y_{ij} \times v_{ij} - N_i \times \log \sum_{j'} w_{ij'} \right]$

First order conditions, for each $p$: $0 = \frac{\partial L}{\partial \beta_p} = \sum_i \left[ \sum_j Y_{ij} \times \frac{\partial v_{ij}}{\partial \beta_p} - N_i \times \frac{\partial \log \sum_{j'} w_{ij'}}{\partial \beta_p} \right]$

Thus, for each $p$: $\sum_i \sum_j Y_{ij} \times X_{ijp} = \sum_i \sum_j N_i \times P(ij) \times X_{ijp}$, since $\frac{\partial v_{ij}}{\partial \beta_p} = X_{ijp}$ and

$$
\frac{\partial \log \sum_j w_{ij}}{\partial \beta_p}
= \frac{\sum_j \partial w_{ij} / \partial \beta_p}{\sum_j w_{ij}}
= \frac{\sum_j w_{ij} \times \partial v_{ij} / \partial \beta_p}{\sum_j w_{ij}}
= \frac{\sum_j w_{ij} \times X_{ijp}}{\sum_j w_{ij}}
= \sum_j P(ij) \times X_{ijp}
$$
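The first-order condition can be checked numerically. The sketch below uses simulated data and a generic optimizer (`scipy.optimize.minimize`) as a stand-in for whatever solver an actual implementation uses; it fits $\beta$ by maximizing the log-likelihood (dropping the constant $C$) and then verifies $\sum_{ij} Y_{ij} X_{ijp} = \sum_{ij} N_i P(ij) X_{ijp}$ at the optimum.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n, k, p_dim = 200, 3, 2
X = rng.normal(size=(n, k, p_dim))
true_beta = np.array([1.0, -0.5])

def probabilities(beta):
    v = X @ beta
    w = np.exp(v - v.max(axis=1, keepdims=True))
    return w / w.sum(axis=1, keepdims=True)

# simulate N_i = 10 observed choices per actor from the true probabilities
N = 10
Y = np.array([rng.multinomial(N, p) for p in probabilities(true_beta)])

def neg_log_lik(beta):
    # -(sum_ij Y_ij v_ij - N_i log sum_j' w_ij'), dropping the constant C
    v = X @ beta
    return -((Y * v).sum() - N * np.log(np.exp(v).sum(axis=1)).sum())

beta_hat = minimize(neg_log_lik, np.zeros(p_dim)).x

# first-order condition: observed and expected sums of X_ijp agree at the optimum
lhs = np.einsum('ij,ijp->p', Y, X)
rhs = np.einsum('ij,ijp->p', N * probabilities(beta_hat), X)
```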

example

See [logit regression of rehousing](logit_regression_of_rehousing).