Abstract
Deep learning has achieved substantial breakthroughs in many fields owing to its powerful automatic representation learning. It is well established that the design of a neural architecture is crucial to how well the data are represented and to the final performance. However, architecture design still relies heavily on researchers' prior knowledge and experience, and the limits of human knowledge make it difficult to move beyond established design paradigms and arrive at an optimal model. An intuitive remedy is therefore to reduce human intervention as much as possible and let an algorithm design the neural architecture automatically. Neural Architecture Search (NAS) is exactly such an approach, and the related body of research is large and varied, so a comprehensive and systematic survey of NAS is essential. Previous surveys classify existing work mainly according to the key components of NAS: search space, search strategy, and evaluation strategy. Although this classification is intuitive, it makes it hard for readers to grasp the underlying challenges and the landmark work that addressed them. In this survey, we therefore take a different perspective: we begin with an overview of the characteristics of the earliest NAS algorithms, summarize the problems of these early approaches, and then present the solutions proposed by subsequent research. In addition, we provide a detailed and comprehensive analysis, comparison, and summary of these works. Finally, we outline possible future research directions.
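For readers unfamiliar with the three components named above, the following minimal sketch (our own illustration, not taken from any surveyed method) shows how a search space, a search strategy, and an evaluation strategy fit together, using plain random search. All identifiers here are hypothetical, and `evaluate` is only a placeholder for what would be training and validation of a candidate network in a real NAS system.

```python
# Minimal illustration of the three NAS components: search space,
# search strategy, and evaluation strategy. Placeholder names only.
import random

# Search space: each candidate architecture is a list of layer operations.
OPERATIONS = ["conv3x3", "conv5x5", "maxpool3x3", "identity"]
MAX_DEPTH = 6

def sample_architecture():
    """Search strategy (here: random search) - draw a depth and one op per layer."""
    depth = random.randint(1, MAX_DEPTH)
    return [random.choice(OPERATIONS) for _ in range(depth)]

def evaluate(architecture):
    """Evaluation strategy - a stand-in score; a real system would train the
    candidate (or a cheap proxy) and return its validation accuracy."""
    return random.random()  # placeholder objective

best_arch, best_score = None, float("-inf")
for _ in range(20):                  # search budget
    arch = sample_architecture()     # propose a candidate from the search space
    score = evaluate(arch)           # estimate its quality
    if score > best_score:
        best_arch, best_score = arch, score

print("best architecture:", best_arch, "score:", round(best_score, 3))
```

The survey's later discussion of early NAS algorithms and their refinements can be read as progressively replacing each of these three placeholders with more sophisticated designs.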