Inaccurate porting of covariance vs naive method

https://github.com/civisanalytics/python-glmnet/blob/813c06f5fcc9604d8e445bd4992f53c4855cc7cb/glmnet/linear.py#L288-L293

`glmnet` actually does a slightly different check than a plain n vs. p comparison like this one. It invokes method 1 (the covariance method) if p <= 500. The covariance method keeps track of a matrix of covariances C(i,j) for every feature i and every _active_ feature j. Under the hood, C is allocated as a full p x p matrix, even though we usually use much less memory than that; this was done for simplicity, because it's very hard to write clever data structures in Fortran. So even when n >> p, if p is also large, the covariance method is not a viable default on most machines.
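To put rough numbers on that p x p allocation, here is a back-of-the-envelope sketch. It assumes C is stored densely with 8 bytes per entry, which is my assumption and not something checked against the Fortran source; `dense_cov_bytes` is just an illustrative name.

```python
def dense_cov_bytes(p, bytes_per_entry=8):
    """Bytes needed for a dense p x p matrix (8-byte entries are an assumption)."""
    return p * p * bytes_per_entry


# Print the footprint in MB for a few feature counts.
for p in (500, 5_000, 50_000):
    print(f"p={p:>6}: {dense_cov_bytes(p) / 1e6:,.0f} MB")
```

Under that assumption, p = 500 costs about 2 MB, while p = 50,000 would need about 20 GB regardless of how large n is.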

Anyway, I'd suggest changing it to:
```python
if X.shape[1] <= 500:
    algo_flag = 1
else:
    algo_flag = 2
```
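If it helps with testing, the same rule could be pulled into a small helper. This is only a sketch: `choose_algo_flag` and its `threshold` parameter are hypothetical names, not part of python-glmnet, and the flag values 1 (covariance) and 2 (naive) follow the snippet above.

```python
def choose_algo_flag(n_features, threshold=500):
    """Return 1 (covariance method) for small p, else 2 (naive method).

    Hypothetical helper; the default threshold of 500 mirrors the rule
    described in this issue.
    """
    return 1 if n_features <= threshold else 2


# In the fit code this would be called as choose_algo_flag(X.shape[1]).
print(choose_algo_flag(500))  # boundary case, still covariance
print(choose_algo_flag(501))  # first p that falls back to naive
```

Keeping the threshold as a parameter makes the boundary at p = 500 easy to cover in a unit test.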

For reference, the current code at the permalink above:

```python
if X.shape[1] > X.shape[0]:
    # the glmnet docs suggest using a different algorithm for the case
    # of p >> n
    algo_flag = 2
else:
    algo_flag = 1
```
