DARTS

2019-ICLR-DARTS Differentiable Architecture Search

  • Hanxiao Liu, Karen Simonyan, Yiming Yang
  • GitHub: 2.8k stars
  • Citations: 557

Motivation

Current NAS methods:

  • Computationally expensive: about 2000 GPU days with reinforcement learning, or about 3000 GPU days with evolution.
  • Discrete search space, which requires a large number of architecture evaluations.

Contribution

  • Differentiable NAS method based on gradient descent.
  • Applies to both CNNs (CV) and RNNs (NLP).
  • SOTA results on CIFAR-10 and PTB.
  • Efficiency: 4 GPU days vs. 2000 GPU days for prior methods.
  • Transferable: CIFAR-10 to ImageNet, and PTB to WikiText-2.

Method

Search Space

Optimization Target

Our goal is to jointly learn the architecture α and the weights w within all the mixed operations (e.g. weights of the convolution filters).
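
As a rough illustration of what a mixed operation is, the sketch below weights each candidate operation on one edge by a softmax over its architecture parameters α and sums the results. The scalar candidate operations, their names, and the single shared weight w are hypothetical stand-ins chosen to keep the example dependency-free; in the paper the candidates are layers such as convolutions and pooling.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical candidate operations on a scalar input x with one weight w.
# Real DARTS candidates are layers (convolutions, pooling, identity, zero).
CANDIDATE_OPS = {
    "identity": lambda x, w: x,
    "scale": lambda x, w: w * x,
    "zero": lambda x, w: 0.0,
}


def mixed_op(x, alpha, w):
    """Softmax-weighted sum of all candidate operations on one edge.

    alpha holds one architecture parameter per candidate operation;
    w stands in for the operation weights learned alongside alpha.
    """
    if len(alpha) != len(CANDIDATE_OPS):
        raise ValueError("alpha needs one entry per candidate operation")
    probs = softmax(alpha)
    return sum(p * op(x, w) for p, op in zip(probs, CANDIDATE_OPS.values()))


# With equal alpha values, every operation contributes equally.
uniform_output = mixed_op(2.0, [0.0, 0.0, 0.0], 3.0)
```

Because the output is a smooth function of both alpha and w, gradients can flow to both, which is what makes joint learning by gradient descent possible.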

Discrete Arch

Experiments

Arch Evaluation

Result Analysis

Conclusion

  • We presented DARTS, a simple yet efficient NAS algorithm for both CNNs and RNNs.
  • SOTA results on CIFAR-10 and PTB.
  • Efficiency improved by several orders of magnitude.

Possible Improvements

  • There may be discrepancies between the continuous architecture encoding and the derived discrete architecture. (softmax…)
  • It would also be interesting to investigate performance-aware architecture derivation schemes based on the shared parameters learned during the search process.
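
The gap between the continuous encoding and the derived discrete architecture can be seen in a small sketch. It assumes the discrete choice on an edge is simply the operation with the largest softmax weight; the operation names and alpha values are made up for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def derive_discrete(alpha, op_names):
    """Return the op with the largest softmax weight, and that weight."""
    probs = softmax(alpha)
    best = max(range(len(probs)), key=lambda i: probs[i])
    return op_names[best], probs[best]


# Hypothetical, nearly uniform architecture parameters for one edge.
op_names = ["op_a", "op_b", "op_c"]
alpha = [0.1, 0.2, 0.0]

chosen, weight = derive_discrete(alpha, op_names)
# The chosen op carried well under half of the mixture during search,
# so the discrete network can behave differently from the mixed one.
```

When the softmax weights are close to uniform like this, the selected operation accounts for only a fraction of what the mixed network computed during search, which is one way to read the discrepancy noted above.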

Appendix

05-11 16:12