Stars
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
[验证码识别-训练] This project is based on CNN/ResNet/DenseNet+GRU/LSTM+CTC/CrossEntropy to realize verification code identification. This project is only for training the model.
Unofficial PyTorch implementation of Masked Autoencoders Are Scalable Vision Learners
ALIbaba's Collection of Encoder-decoders from MinD (Machine IntelligeNce of Damo) Lab
Neural machine translation and sequence learning using TensorFlow
Generate text captions for images from their embeddings.
Multimodal model training on aiMotive Dataset
基于google官方的models中的SSD_MobileNet网络实现车辆的检测,使用Kalman实现对目标的跟踪
Dataset loader and renderer for aiMotive Multimodal Dataset
DB, OCR, Pytorch, 文本检测算法,A PyToch implementation of "Real-time Scene Text Detection with Differentiable Binarization".
百度AI Studio的问答摘要与推理项目,使用Seq2Seq+Attention实现,使用Beam Search和惩罚重复来解决预测文本重复的问题,使用PGN网络来解决生成摘要存在OOV词汇的问题
Focal_loss, K-Fold, 疫情期间网民情绪识别, Bert, CNN, 动态学习率
pytorch, ocr, crnn, ctc loss, variable width, variable height
yolov3的Tensorflow实现--基于Python3
Few shot classification implementation using the feature of latest released CLIP by openai
deeplabv3p, atros_resnet, xception, focal loss, 百度阿波罗数据集
