This chapter is a summary of the tutorial. The libtorch tutorial has eight chapters, which mainly implement C++ versions of classification, segmentation and detection tools. For anyone who needs C++/C#/Java in their projects, it should be a real help.
The code is almost entirely typed by hand, unlike many current Python projects, where you can borrow directly from elsewhere by copy and paste (Ctrl+C, Ctrl+V). Debugging a C++ project is also much more troublesome than debugging Python, and once the project is large every compile takes time. I can only say that C++ is dirty work, the kind that makes people go bald 👨‍🦲.
But in deep learning positions today, especially computer vision ones, there is really not much headcount for Python-only roles. Big companies do have openings, but their bar is high; small companies have a lower bar but still expect basics such as deployment. Generally, large companies and research institutes can split training and deployment between different people, while small and medium-sized companies require both C++ and Python. Even many large companies expect one person to handle training and deployment together. So if you work in deep learning, some C++ ability, or at least deployment and tuning experience, really matters.
In fact, this chapter is less a tutorial than a collection of thoughts. I originally wanted to share something GAN-related, but I was really busy and too lazy to do it. It would have been much the same anyway: once you have worked through the previous chapters, writing a complete DCGAN in libtorch could hardly be simpler.
Here is a summary of my experience programming with libtorch:
- There are many pitfalls in libtorch. For example, nesting a Sequential inside a Sequential reports an error, and writing another Sequential-like stacking class does not solve it. On the CPU it is sometimes slower than Python, although on the GPU it is generally about 30% faster.
- Sequential in libtorch cannot stack modules that take a std::vector<torch::Tensor> as input. The yolov5 model is an example: it has a Concat module, which cannot be used in libtorch's Sequential. For a while I didn't know whether to blame the yolov5 author for sloppy code, libtorch for being garbage, or a lack of resources. So in the end I made a yolov4_tiny instead.
- Libtorch still has a lot to optimize, speed for one. If it offered an API for int8 inference, TensorRT's market share might well shrink. Of course, it would probably still not be as fast as TensorRT. After all, nobody can make better use of Nvidia graphics cards than Nvidia...
- If possible, write your own model in Python and train it there to get pre-trained weights, then load and finetune them in libtorch. That works better than hunting for suitable open source projects and weights on the internet. Of course, if you really have the energy, you can train a libtorch model from scratch. Maybe by the time my project is complete it will support training from scratch in libtorch too. A lot still needs to be added: many data augmentations are required, and some learning rate adjustment strategies have to be implemented by yourself. With time and resources, it should be possible. After all, libtorch has taken over the banner from caffe, and a large company, Facebook, is maintaining it.
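As a rough illustration of the learning rate strategies mentioned in the last point, here is a minimal sketch of two common schedules, step decay and cosine annealing, written in plain C++ with no libtorch dependency. The function names `step_decay_lr` and `cosine_lr` and all of their parameters are my own assumptions for illustration; they are not libtorch APIs, and how you push the resulting value into an optimizer depends on your libtorch version.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not a libtorch API): step decay.
// The base learning rate is multiplied by `gamma` once every `step_size` epochs.
double step_decay_lr(double base_lr, int epoch, int step_size, double gamma) {
    assert(step_size > 0 && epoch >= 0);
    // Integer division counts how many full steps have elapsed.
    return base_lr * std::pow(gamma, epoch / step_size);
}

// Hypothetical helper (not a libtorch API): cosine annealing.
// The rate falls from `base_lr` at epoch 0 to `min_lr` at `total_epochs`
// along half a cosine wave, and stays at `min_lr` afterwards.
double cosine_lr(double base_lr, double min_lr, int epoch, int total_epochs) {
    assert(total_epochs > 0 && epoch >= 0);
    if (epoch >= total_epochs) return min_lr;
    const double pi = std::acos(-1.0);
    const double progress = static_cast<double>(epoch) / total_epochs;
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + std::cos(pi * progress));
}
```

For example, `step_decay_lr(0.1, 25, 10, 0.5)` halves the rate twice and gives 0.025, while `cosine_lr(0.1, 0.0, 50, 100)` sits at the midpoint of the curve, about 0.05. In a training loop you would compute the value once per epoch and write it into the optimizer's options before the next epoch starts.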
Going forward, I will focus on two open source projects, LibtorchSegment and LibtorchDetection, and strive to provide as many backbone networks, frameworks, data augmentations and training optimization interfaces as possible.