Failure on MacOS for Julia 1.3, 1.4: Intel MKL Error #212
Adding some considerations: commenting out the clustering tests is not enough, the regression and classification tests also fail horribly; there's something seriously wrong here.
and more of the same. I'm wondering whether there's a clash between Conda versions somehow |
We should compare
cc @OkonSamuel |
Last commit OK: 61febaa on Feb 26
First commit not OK: d9944ca on Feb 28
List of differences:
Maybe relevant
Probably irrelevant
OpenSpecFun calls the compiler support libraries. Note that this change was made on February 26, which is also the date of our last working push. Let's try putting a compat bound on that |
OK, this is almost certainly the culprit: the compat of that package is Julia 1.3, which would also explain why it fails on 1.3/1.4. Now, where the heck is this loaded? |
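One way to answer "where is this loaded" is to ask Pkg which packages in the active environment list the library as a direct dependency. The following is only a sketch: `direct_dependents` is a hypothetical helper name, not part of Pkg, and it assumes Julia 1.4 or later, where `Pkg.dependencies()` is available.

```julia
using Pkg

# Hypothetical helper: names of the packages in `deps` that list `target`
# as a direct dependency. `deps` is expected to have the shape returned by
# `Pkg.dependencies()`, i.e. values with `name` and `dependencies` fields.
function direct_dependents(deps, target::AbstractString)
    names = String[]
    for (_, info) in deps
        if haskey(info.dependencies, target)
            push!(names, info.name)
        end
    end
    return sort(names)
end

# Query the currently active environment.
for name in direct_dependents(Pkg.dependencies(), "OpenSpecFun_jll")
    println(name)
end
```

Run inside the MLJModels environment, this should print whichever packages pull in OpenSpecFun_jll directly; transitive users would need a recursive walk.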
Steps to fix the problem:
in your terminal, go wherever that is, for me
(@v1.5) pkg> activate .
 Activating environment at `~/.julia/dev/MLJModels/Project.toml`
(MLJModels) pkg> status
Project MLJModels v0.9.3
Status `~/.julia/dev/MLJModels/Project.toml`
324d7699 CategoricalArrays v0.7.7
b4f34e82 Distances v0.8.2
31c24e10 Distributions v0.22.6
a7f614a8 MLJBase v0.12.3 `~/.julia/dev/MLJBase`
e80e1ace MLJModelInterface v0.2.1 `../MLJModelInterface`
6f286f6a MultivariateStats v0.7.0
efe28fd5 OpenSpecFun_jll v0.5.3+1 `~/.julia/dev/OpenSpecFun_jll`
bac558e1 OrderedCollections v1.1.0
d96e819e Parameters v0.12.0
ae029012 Requires v1.0.1
321657f4 ScientificTypes v0.7.1 `../ScientificTypes`
2913bbd2 StatsBase v0.32.2
bd369af6 Tables v1.0.3
b77e0a4c InteractiveUtils
37e2e46d LinearAlgebra
44cfe95a Pkg
9a3f8284 Random
10745b16 Statistics
[ Info: Precompiling MLJModels [d491faf4-2d78-11e9-2867-c94bc002c0b7]
[ Info: Model metadata loaded from registry.
[ Info: Precompiling RDatasets [ce6b1742-4840-55fa-b093-852dadbb1d8b]
[ Info: Training Machine{AffinityPropagation} @ 1…25.
Test Summary: | Pass Total
AffinityPropagation | 5 5
[ Info: Training Machine{AgglomerativeClustering} @ 1…41.
Test Summary: | Pass Total
AgglomerativeClustering | 4 4
[ Info: Training Machine{Birch} @ 9…06.
Test Summary: | Pass Total
Birch | 8 8
[ Info: Training Machine{DBSCAN} @ 9…63.
Test Summary: | Pass Total
DBSCAN | 4 4
[ Info: Training Machine{FeatureAgglomeration} @ 1…75.
Test Summary: | Pass Total
FeatureAgglomeration | 7 7
[ Info: Training Machine{KMeans} @ 2…15.
Test Summary: | Pass Total
KMeans | 6 6
[ Info: Training Machine{MiniBatchKMeans} @ 9…15.
Test Summary: | Pass Total
MBKMeans | 6 6
[ Info: Training Machine{MeanShift} @ 9…80.
Test Summary: | Pass Total
MeanShift | 5 5
[ Info: Training Machine{OPTICS} @ 5…86.
Test Summary: | Pass Total
OPTICS | 4 4
[ Info: Training Machine{SpectralClustering} @ 1…33.
Test Summary: | Pass Total
SpectralClustering | 4 4
Test Summary: | Pass Total
SVM-infos | 7 7
Test Summary: | Pass Total
SVCs | 3 3
Test Summary: | Pass Total
SVRs | 3 3
Test Summary: | Pass Total
ARD | 3 3
Test Summary: | Pass Total
BayesianRidge | 3 3
Test Summary: | Pass Total
ElasticNet | 3 3
Test Summary: | Pass Total
ElasticNetCV | 3 3
Test Summary: | Pass Total
Huber | 3 3
Test Summary: | Pass Total
Lars | 3 3
Test Summary: | Pass Total
LarsCV | 3 3
Test Summary: | Pass Total
Lasso | 3 3
Test Summary: | Pass Total
LassoCV | 3 3
Test Summary: | Pass Total
LassoLars | 3 3
Test Summary: | Pass Total
LassoLarsCV | 3 3
Test Summary: | Pass Total
LassoLarsIC | 3 3
Test Summary: | Pass Total
LinReg | 5 5
Test Summary: | Pass Total
OMP | 3 3
Test Summary: | Pass Total
OMPCV | 3 3
Test Summary: | Pass Total
OMPCV | 3 3
Test Summary: | Pass Total
PassAggr | 3 3
Test Summary: | Pass Total
RANSAC | 3 3
Test Summary: | Pass Total
Ridge | 3 3
Test Summary: | Pass Total
RidgeCV | 4 4
Test Summary: | Pass Total
SGDReg | 3 3
Test Summary: | Pass Total
TheilSen | 3 3
Test Summary: | Pass Total
MTLassoCV | 4 4
Test Summary: | Pass Total
MTLassoCV | 3 3
Test Summary: | Pass Total
MTElNet | 3 3
Test Summary: | Pass Total
MTElNetCV | 3 3
Test Summary: | Pass Total
LogRegClf | 8 8
Test Summary: | Pass Total
LogRegCVClf | 8 8
Test Summary: | Pass Total
PAClf | 7 7
Test Summary: | Pass Total
PerceptronClf | 7 7
Test Summary: | Pass Total
RidgeClf | 7 7
Test Summary: | Pass Total
RidgeCVClf | 7 7
Test Summary: | Pass Total
SGDClf | 13 13
Test Summary: | Pass Total
GPRegressor | 3 3
Test Summary: | Pass Total
GPClassif | 8 8
Test Summary: | Pass Total
AdaBoostReg | 3 3
Test Summary: | Pass Total
BaggingReg | 3 3
Test Summary: | Pass Total
XTreeReg | 3 3
Test Summary: | Pass Total
GBReg | 3 3
Test Summary: | Pass Total
RFReg | 3 3
Test Summary: | Pass Total
AdaboostClf | 8 8
Test Summary: | Pass Total
BaggingClf | 8 8
Test Summary: | Pass Total
GradBoostClf | 8 8
Test Summary: | Pass Total
RForestClf | 8 8
Test Summary: | Pass Total
XTreeClf | 8 8
Test Summary: | Pass Total
LDA | 8 8
Test Summary: | Pass Total
QDA | 8 8
Test Summary: | Pass Total
DummyReg | 4 4
Test Summary: | Pass Total
DummyClf | 7 7
Test Summary: | Pass Total
GaussianNBClf | 8 8
Test Summary: | Pass Total
KNNReg | 5 5
Test Summary: | Pass Total
KNNClf | 8 8
Test Summary: | Pass Total
BernNBClf | 7 7
Test Summary: | Pass Total
MultiNBClf | 7 7
Test Summary: | Pass Total
ComplNBClf | 7 7
This is on a Mac with v1.5 |
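For anyone who wants to repeat the session above outside the Pkg REPL, the same steps can be scripted with the Pkg API. This is a minimal sketch and assumes it is run from the root of a local MLJModels checkout (for me that is `~/.julia/dev/MLJModels`; yours may differ).

```julia
using Pkg

# Assumption: the current directory is the root of a local MLJModels checkout.
Pkg.activate(".")   # same as `pkg> activate .`
Pkg.status()        # same as `pkg> status`; check which OpenSpecFun_jll is in use
Pkg.test()          # run the test suite of the active project
```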
@tlienart looking at the METADATA file, it seems OpenSpecFun_jll.jl is a dependency only used by Distributions.jl and SpecialFunctions.jl |
I think possibly RMath? But either way Elliott has commented on a similar issue in the BinaryBuilder package, seems like |
I tried to reproduce this locally, and I haven't been able to so far. I installed |
Thanks for trying this Elliott, it may well be that the error now only shows on Travis (cf link in the other issue) though I’m a bit surprised if that’s really the case. My environment is far from pristine but I’ll try creating one from scratch and testing on my side. Either way I think it would be good to make sure we can control this so that no one encounters undue segfaults ! Thanks again! |
@staticfloat
using MLJ
X, y = @load_boston;
train, test = partition(eachindex(y), .7, rng=333);
@load LGBMRegressor
mdl = LGBMRegressor()
mach = machine(mdl, X, y)
fit!(mach, rows=train)
@load ARDRegressor
mdl = ARDRegressor()
mach = machine(mdl, X, y)
fit!(mach, rows=train)
PyError ($(Expr(:escape, :(ccall(#= /Users/AZevelev/.julia/packages/PyCall/zqDXB/src/pyfncall.jl:43 =# @pysym(:PyObject_Call), PyPtr, (PyPtr, PyPtr, PyPtr), o, pyargsptr, kw))))) <class 'numpy.linalg.LinAlgError'>
LinAlgError('unrecoverable internal error.')
File "/Users/AZevelev/.julia/conda/3/lib/python3.7/site-packages/sklearn/linear_model/_bayes.py", line 577, in fit
sigma_ = update_sigma(X, alpha_, lambda_, keep_lambda, n_samples)
File "/Users/AZevelev/.julia/conda/3/lib/python3.7/site-packages/sklearn/linear_model/_bayes.py", line 562, in update_sigma
X[:, keep_lambda].T))
File "/Users/AZevelev/.julia/conda/3/lib/python3.7/site-packages/sklearn/externals/_scipy_linalg.py", line 99, in pinvh
s, u = decomp.eigh(a, lower=lower, check_finite=False)
File "/Users/AZevelev/.julia/conda/3/lib/python3.7/site-packages/scipy/linalg/decomp.py", line 474, in eigh
raise LinAlgError("unrecoverable internal error.")
pyerr_check at exception.jl:60 [inlined]
pyerr_check at exception.jl:64 [inlined]
_handle_error(::String) at exception.jl:81
macro expansion at exception.jl:95 [inlined]
#110 at pyfncall.jl:43 [inlined]
disable_sigint at c.jl:446 [inlined]
__pycall! at pyfncall.jl:42 [inlined]
_pycall!(::PyCall.PyObject, ::PyCall.PyObject, ::Tuple{Array{Float64,2},Array{Float64,1}}, ::Int64, ::Ptr{Nothing}) at pyfncall.jl:29
_pycall!(::PyCall.PyObject, ::PyCall.PyObject, ::Tuple{Array{Float64,2},Array{Float64,1}}, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at pyfncall.jl:11
(::PyCall.PyObject)(::Array{Float64,2}, ::Vararg{Any,N} where N; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at pyfncall.jl:86
(::PyCall.PyObject)(::Array{Float64,2}, ::Vararg{Any,N} where N) at pyfncall.jl:86
fit!(::PyCall.PyObject, ::Array{Float64,2}, ::Vararg{Any,N} where N; kwargs::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}) at Skcore.jl:100
fit!(::PyCall.PyObject, ::Array{Float64,2}, ::Array{Float64,1}) at Skcore.jl:100
fit(::ARDRegressor, ::Int64, ::NamedTuple{(:Crim, :Zn, :Indus, :NOx, :Rm, :Age, :Dis, :Rad, :Tax, :PTRatio, :Black, :LStat),NTuple{12,Array{Float64,1}}}, ::Array{Float64,1}) at ScikitLearn.jl:157
fit!(::Machine{ARDRegressor}; rows::Array{Int64,1}, verbosity::Int64, force::Bool) at machines.jl:183
(::StatsBase.var"#fit!##kw")(::NamedTuple{(:rows,),Tuple{Array{Int64,1}}}, ::typeof(fit!), ::Machine{ARDRegressor}) at machines.jl:146
top-level scope at MLJ_FitMachine_Error.jl:18 |
Okay, I no longer fail on Sorry I don't have 1.4 installed just now. Can check later |
So it's only CI that fails, maybe due to the way it installs scikit-learn. @staticfloat, what do you think of the Travis log here: https://travis-ci.com/github/alan-turing-institute/MLJModels.jl/jobs/320357986 ?
This well-known issue, which is not related to MLJ, is being tracked, so closing |
Further to #211 :
I have introduced CI on this branch for Julia 1.3 and Julia 1.4, where testing is now failing for macOS.
The error is triggered by testing of the wrapped scikit-learn clustering models. According to the Travis logs, the conda installations for scikit-learn are the same for Linux and macOS, except that macOS has an additional package, llvm-openmp-4.0.1, installed.
Any help at all on this one would be appreciated. In particular, should I regard this as a Julia 1.3/1.4 error?
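To compare the Python side between two machines, something like the following could be run on each and the output diffed. It is a sketch that assumes Conda.jl and PyCall.jl are installed and that PyCall is using Conda.jl's Python; neither assumption is confirmed by the logs.

```julia
using Conda, PyCall

# Print the packages installed in Conda.jl's root environment.
Conda.list()

# Show which Python executable PyCall was built against.
println(PyCall.pyprogramname)

# Print NumPy's build configuration, including the BLAS/LAPACK backend,
# which is relevant when the failure is reported as an Intel MKL error.
np = pyimport("numpy")
np.show_config()
```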
Here is the tail of the stack trace