Logit regression

eoudejans edited this page Jul 23, 2024 · 8 revisions

If one assumes that the probability $P(ij)$ that actor $i = 1..n$ chooses alternative $j = 1..k$ is proportional (within the set of alternative choices) to the exponent of a linear combination of $p = 1..P$ data values $X_{ijp}$ related to $i$ and $j$, one arrives at the logit model, or more formally:

Assume $P(ij) \propto w_{ij}$, with $w_{ij} := \exp(v_{ij})$ and $v_{ij} := \sum_p \beta_p X_{ijp}$.

Thus $L(ij) := \log(P(ij))$, which equals $v_{ij}$ up to an additive constant.

Consequently, $w_{ij} > 0$ and $P(ij) := \frac{w_{ij}}{\sum_{j'} w_{ij'}}$, since $\sum_j P(ij)$ must be $1$.

Note that:

  • $v_{ij}$ is a linear combination of the $X_{ijp}$ with weights $\beta_p$ as logit model parameters.
  • the odds ratio $\frac{P(ij)}{P(ij')}$ of choice $j$ against alternative $j'$ is equal to $\frac{w_{ij}}{w_{ij'}} = \exp(v_{ij} - v_{ij'}) = \exp\left(\sum_p \beta_p (X_{ijp} - X_{ij'p})\right)$
  • this formulation does not require a separate beta index (aka parameter space dimension) per alternative choice $j$ for each exogenous variable.
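As a concrete illustration of the definitions above, $P(ij)$ is a softmax over the alternatives $j$ of the linear scores $v_{ij}$. The sketch below is mine, not part of this wiki: the names `choice_probabilities`, `X`, `beta` and the dense array layout `X[i, j, p]` are assumptions.

```python
import numpy as np

def choice_probabilities(X, beta):
    """Logit choice probabilities P(ij) for data X[i, j, p] and parameters beta[p]."""
    v = X @ beta                             # v_ij = sum_p beta_p * X_ijp, shape (n, k)
    v = v - v.max(axis=1, keepdims=True)     # shift per actor for numerical stability
    w = np.exp(v)                            # weights w_ij > 0
    return w / w.sum(axis=1, keepdims=True)  # P_ij = w_ij / sum_j' w_ij'

# hypothetical data: n=3 actors, k=2 alternatives, P=2 factors
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 2, 2))
beta = np.array([0.5, -1.0])
P = choice_probabilities(X, beta)            # each row sums to 1
```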

observed data

Observed choices $Y_{ij}$ are assumed to be drawn from a repeated Bernoulli experiment with probabilities $P(ij)$.

Thus $P(Y) = \prod_i N_i! \times \prod_j \frac{P(ij)^{Y_{ij}}}{Y_{ij}!}$ with $N_i := \sum_j Y_{ij}$.

Thus

$$
\begin{aligned}
L(Y) := \log(P(Y)) &= \log \prod_i N_i! \times \prod_j \frac{P(ij)^{Y_{ij}}}{Y_{ij}!} \\
&= C + \sum_i \sum_j Y_{ij} \times \log(P(ij)) \\
&= C + \sum_i \Bigl[ \sum_j Y_{ij} \times L(ij) \Bigr] \\
&= C + \sum_i \Bigl[ \sum_j Y_{ij} \times \bigl( v_{ij} - \log \sum_{j'} w_{ij'} \bigr) \Bigr] \\
&= C + \sum_i \Bigl[ \bigl( \sum_j Y_{ij} \times v_{ij} \bigr) - N_i \times \log \sum_{j'} w_{ij'} \Bigr]
\end{aligned}
$$

with $C = \sum_i C_i$ and $C_i := \log(N_i!) - \sum_j \log(Y_{ij}!)$, which is independent of $P(ij)$ and $\beta_p$. Note that $N_i = 1 \Rightarrow C_i = 0$.
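The log-likelihood above can be sketched directly; the function name and array layout below are my assumptions, and `gammaln(x + 1)` stands in for $\log(x!)$ to compute the constant $C$:

```python
import numpy as np
from scipy.special import gammaln

def log_likelihood(Y, X, beta):
    """L(Y) = C + sum_i [ sum_j Y_ij * v_ij - N_i * log sum_j' w_ij' ]."""
    v = X @ beta                                      # v_ij
    N = Y.sum(axis=1)                                 # N_i = sum_j Y_ij
    log_denom = np.log(np.exp(v).sum(axis=1))         # log sum_j' w_ij'
    C = gammaln(N + 1).sum() - gammaln(Y + 1).sum()   # sum_i [log(N_i!) - sum_j log(Y_ij!)]
    return C + ((Y * v).sum(axis=1) - N * log_denom).sum()
```

Because $C$ is included, this agrees term by term with the multinomial log-probability of the observed counts.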

specification

The presented form $v_{ij} := \beta_p \times X_{ijp}$ (using Einstein notation from here on) is more generic than known implementations of logistic regression (such as in SPSS and R), where $X_{iq}$ is a set of $q = 1..Q$ data values given for each $i$ (with $X_{i0}$ set to $1$ to represent the intercept for each $j$) and $(k-1) \times (Q+1)$ parameters are to be estimated, thus $v_{ij} := \beta_{jq} \times X_{iq}$ for $j = 2..k$. This requires a different beta for each alternative choice and data set, causing an unnecessarily large parameter space.

The latter specification can be reduced to the more generic form by:

  • assigning a unique $p$ to each $(j, q)$ combination, represented by $A_{jqp}$.
  • defining $X_{ijp} := A_{jqp} \times X_{iq}$ for $j = 2..k$, thus creating redundant and zero data values.

However, a generic model cannot be reduced to a specification with different $\beta$'s for each alternative choice unless the latter parameter space can be restricted to contain no more dimensions than the generic form. With large $n$ and $k$, the data values $X_{ijp}$ can be huge. To mitigate the data size, the following tricks can be applied:

  • limit the set of combinations of $i$ and $j$ to the most probable or near $j$'s for each $i$ and/or cluster the other $j$'s.
  • use only a sample from the set of possible $i$'s.
  • support specific forms of data:

| # | form | reduction | description |
|---|------|-----------|-------------|
| 0 | $\beta_p X_{ijp}$ | | general form of $p$ factors specific for each $i$ and $j$ |
| 1 | $\beta_p A_{jqp} X_{iq}$ | $X_{ijp} := A_{jqp} X_{iq}$ | $q$ factors that vary with $i$ but not with $j$ |
| 2 | $\beta_p X_{ip} X_{jp}$ | $X_{ijp} := X_{jp} X_{ip}$ | $p$ specific factors in simple multiplicative form |
| 3 | $\beta_{jq} X_{iq}$ | | $q$ factors that vary with $j$ but not with $i$ |
| 4 | $\beta_p X_{jp}$ | $X_{ijp} := X_{jp}$ | state constants $D_j$ |
| 5 | $\beta_j$ | | state dependent intercept |
| 6 | $\beta_p (J_{ip} == j)$ | | usage of a recorded preference |
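A small sketch of the reduction described above: the per-alternative specification $v_{ij} = \beta_{jq} X_{iq}$ is mapped into the generic form through the indicator array $A_{jqp}$. All names and sizes here are hypothetical; the point is that the materialized $X_{ijp}$ is mostly zeros, while a direct matrix product gives the same $v_{ij}$ without it.

```python
import numpy as np

n, k, Q = 4, 3, 2                      # actors, alternatives, data values per actor
rng = np.random.default_rng(2)
X_iq = rng.normal(size=(n, Q))
beta_jq = rng.normal(size=(k, Q))
beta_jq[0] = 0.0                       # reference alternative: betas fixed at zero

# direct per-alternative form: v_ij = sum_q beta_jq * X_iq
v_direct = X_iq @ beta_jq.T            # shape (n, k)

# generic form: assign a unique p to each (j, q) combination via indicator A[j, q, p]
P_dim = k * Q
A = np.zeros((k, Q, P_dim))
for j in range(k):
    for q in range(Q):
        A[j, q, j * Q + q] = 1.0
X_ijp = np.einsum('jqp,iq->ijp', A, X_iq)   # X_ijp := A_jqp * X_iq (mostly zeros)
beta_p = beta_jq.reshape(-1)                # flatten beta_jq into a generic beta_p
v_generic = X_ijp @ beta_p                  # same v_ij as the direct form
```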

regression

The $\beta_p$'s are found by maximizing the likelihood $L(Y|\beta)$, which is equivalent to finding the maximum of $\sum_i \left[ \sum_j Y_{ij} \times v_{ij} - N_i \times \log \sum_{j'} w_{ij'} \right]$

First order conditions, for each $p$: $0 = \frac{\partial L}{\partial \beta_p} = \sum_i \left[ \sum_j Y_{ij} \times \frac{\partial v_{ij}}{\partial \beta_p} - N_i \times \frac{\partial \log \sum_{j'} w_{ij'}}{\partial \beta_p} \right]$

Thus, for each $p$: $\sum_i \sum_j Y_{ij} \times X_{ijp} = \sum_i \sum_j N_i \times P(ij) \times X_{ijp}$, since $\frac{\partial v_{ij}}{\partial \beta_p} = X_{ijp}$ and

$$
\frac{\partial \log \sum_j w_{ij}}{\partial \beta_p}
= \frac{\sum_j \partial w_{ij} / \partial \beta_p}{\sum_j w_{ij}}
= \frac{\sum_j w_{ij} \times \partial v_{ij} / \partial \beta_p}{\sum_j w_{ij}}
= \frac{\sum_j w_{ij} \times X_{ijp}}{\sum_j w_{ij}}
= \sum_j P(ij) \times X_{ijp}
$$
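The first-order condition can be checked numerically. The sketch below uses simulated data and a generic optimizer (`scipy.optimize.minimize`) as a stand-in for whatever solver an actual implementation uses; it fits $\beta$ by maximizing the log-likelihood (dropping the constant $C$) and then verifies $\sum_{ij} Y_{ij} X_{ijp} = \sum_{ij} N_i P(ij) X_{ijp}$ at the optimum.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n, k, p_dim = 200, 3, 2
X = rng.normal(size=(n, k, p_dim))
true_beta = np.array([1.0, -0.5])

def probabilities(beta):
    v = X @ beta
    w = np.exp(v - v.max(axis=1, keepdims=True))
    return w / w.sum(axis=1, keepdims=True)

# simulate N_i = 10 observed choices per actor from the true probabilities
N = 10
Y = np.array([rng.multinomial(N, p) for p in probabilities(true_beta)])

def neg_log_lik(beta):
    # -(sum_ij Y_ij v_ij - N_i log sum_j' w_ij'), dropping the constant C
    v = X @ beta
    return -((Y * v).sum() - N * np.log(np.exp(v).sum(axis=1)).sum())

beta_hat = minimize(neg_log_lik, np.zeros(p_dim)).x

# first-order condition: observed and expected sums of X_ijp agree at the optimum
lhs = np.einsum('ij,ijp->p', Y, X)
rhs = np.einsum('ij,ijp->p', N * probabilities(beta_hat), X)
```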

example

See [logit regression of rehousing](logit_regression_of_rehousing).