ID | Category | Severity | Reproducibility | Date Submitted | Last Update |
0000525 | [ALGLIB] Data analysis | feature | have not tried | 2013-06-20 12:24 | 2014-06-03 16:31 |
Reporter | SergeyB | View Status | public |
Assigned To | SergeyB |
Priority | normal | Resolution | open | Platform |
Status | assigned | OS |
Projection | none | OS Version |
ETA | none | Fixed in Version | Product Version |
Target Version | Next 'Data mining' release | Product Build |
Summary | 0000525: Neural network improvements |
Description |
Neural structure:
* layered
* complex interactions between layers
* several activation functions: tanh(), tanh()+linear, fast sigmoid

SGD:
* algorithm without learning rates: http://yann.lecun.com/exdb/publis/pdf/schaul-icml-13.pdf
* page 72 of http://learning.stat.purdue.edu/mlss/_media/mlss/bottou.pdf - important trick
* important info on BP acceleration: http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf
* ADADELTA seems to be best method - http://www.matthewzeiler.com/pubs/googleTR2012/googleTR2012.pdf
* ADAGRAD???
* "On the importance of initialization and momentum in deep learning", http://jmlr.org/proceedings/papers/v28/sutskever13.pdf

Improvements:
* shortcut layer, see "Deep Learning Made Easier by Linear Transformations in Perceptrons". Maybe - pre-training linear layer separately.

Decide on:
* minibatch training
* bagging for ensembles
* parallel errors for ensembles
* sparse errors for ensembles
* subset errors for ensembles
* decay in ensemble training?
* investigate ensemble tendency to overfit on GLASS dataset
* mini-batch LBFGS training
* approximate Hessian preconditioning
* FindBestDecay
* FindBestNetwork
* "ensemble selection", better way of constructing ensemble
* model compression
* sparse autoencoders?
* stacked autoencoders/autodecoders?

Convolutional Neural Networks (weight sharing = constraints and projections):
* http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf
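
For reference, the ADADELTA item in the SGD list can be illustrated with a small sketch of the per-parameter update rule from the linked Zeiler report. This is not ALGLIB code and does not reflect any planned ALGLIB API. The names `adadelta_step`, `minimize`, `toy_loss` and `toy_grad` are hypothetical, the quadratic test function is made up for illustration, and `rho=0.95`, `eps=1e-6` are only assumed typical settings.

```python
import math


def adadelta_step(x, grad, state, rho=0.95, eps=1e-6):
    """Apply one ADADELTA update to x in place (hypothetical helper).

    state["eg2"]  holds the running average of squared gradients,
    state["edx2"] holds the running average of squared updates.
    """
    for i, g in enumerate(grad):
        # Accumulate the squared gradient.
        state["eg2"][i] = rho * state["eg2"][i] + (1.0 - rho) * g * g
        # Step = -(RMS of past updates / RMS of gradients) * gradient.
        dx = -math.sqrt(state["edx2"][i] + eps) / math.sqrt(state["eg2"][i] + eps) * g
        # Accumulate the squared update, then move the parameter.
        state["edx2"][i] = rho * state["edx2"][i] + (1.0 - rho) * dx * dx
        x[i] += dx


def minimize(grad_fn, x0, steps=2000):
    """Run a fixed number of ADADELTA steps starting from x0."""
    x = list(x0)
    state = {"eg2": [0.0] * len(x), "edx2": [0.0] * len(x)}
    for _ in range(steps):
        adadelta_step(x, grad_fn(x), state)
    return x


# Toy problem (an assumption for illustration): a badly scaled quadratic.
def toy_loss(x):
    return (x[0] - 3.0) ** 2 + 10.0 * (x[1] + 1.0) ** 2


def toy_grad(x):
    return [2.0 * (x[0] - 3.0), 20.0 * (x[1] + 1.0)]


start = [0.0, 0.0]
result = minimize(toy_grad, start)
print("loss before:", toy_loss(start), "loss after:", toy_loss(result))
```

No learning rate appears anywhere in the update: the step size comes from the ratio of the two running averages, which is the property that makes the method attractive here. The same `state` layout would extend to minibatch gradients if `grad_fn` returned a gradient estimated on a subset of the data.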
Steps To Reproduce |
Additional Information |
Programming language | Unspecified |
Attached Files |
There are no notes attached to this issue. |