Dummy thoughts
- Use a Gaussian mask as the GT mask when training Mask R-CNN on WIDER FACE; the segmentation branch works as an auxiliary loss. 
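A minimal sketch of rendering such a Gaussian GT mask from a face bounding box; the `sigma_scale` choice (std proportional to box size) is an assumption, not a fixed recipe:

```python
import numpy as np

def gaussian_mask(h, w, box, sigma_scale=0.25):
    """Render a soft Gaussian 'GT mask' inside a face bounding box.

    box = (x1, y1, x2, y2); the Gaussian is centered on the box center
    with std proportional to the box size (sigma_scale is a free choice).
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    sx = max((x2 - x1) * sigma_scale, 1e-6)
    sy = max((y2 - y1) * sigma_scale, 1e-6)
    ys, xs = np.mgrid[0:h, 0:w]
    mask = np.exp(-(((xs - cx) / sx) ** 2 + ((ys - cy) / sy) ** 2) / 2.0)
    return mask.astype(np.float32)
```

The soft mask can then be regressed by the mask branch with an L2 loss instead of the usual per-pixel cross-entropy on hard masks.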
- Mask R-CNN segmentation branch incorporating low-level features. 
- Use an external memory to process each video. The memory is built online; for each video, we access both the memory and the current frame for detection / segmentation / tracking. 
- Semantic segmentation + per-pixel embedding -> panoptic segmentation 
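A toy sketch of the second stage, assuming a semantic label map and per-pixel embeddings are already predicted; the greedy grouping and the `thresh` value are stand-ins for a learned clustering step:

```python
import numpy as np

def embeddings_to_instances(sem, emb, thing_classes, thresh=0.5):
    """Turn semantic labels + per-pixel embeddings into instance ids.

    Within each 'thing' class, pixels are greedily grouped: a pixel joins
    an existing instance if its embedding is within `thresh` of that
    instance's first embedding, otherwise it starts a new instance.
    """
    inst = np.zeros(sem.shape, dtype=np.int32)
    next_id = 1
    for c in thing_classes:
        ys, xs = np.where(sem == c)
        seeds, ids = [], []
        for y, x in zip(ys, xs):
            v = emb[y, x].astype(np.float64)
            d = [np.linalg.norm(v - s) for s in seeds]
            if d and min(d) < thresh:
                inst[y, x] = ids[int(np.argmin(d))]
            else:
                seeds.append(v)
                ids.append(next_id)
                inst[y, x] = next_id
                next_id += 1
    return inst
```

Stuff classes keep their semantic label directly; only thing classes go through the embedding clustering.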
- Supervise the SPP module with hand-coded GT 
- Supervise spatial attention with hand-coded GT 
- ShuffleNet V2 x0.5 trains well, but x1.0 does not. Copy the x0.5 model's weights as the initialization for x1.0. 
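One way to sketch the copy-initialization for a single conv layer, assuming the channel widths roughly double; the tiling-and-rescaling heuristic here is my assumption, not a published recipe:

```python
import numpy as np

def widen_conv_weight(w, factor=2):
    """Initialize a wider conv layer by tiling a narrower one's weights.

    w has shape (out_c, in_c, kh, kw); channels are repeated `factor`
    times, and the input axis is rescaled so pre-activation magnitudes
    stay comparable to the original layer.
    """
    w2 = np.repeat(w, factor, axis=0)            # widen output channels
    w2 = np.repeat(w2, factor, axis=1) / factor  # widen inputs, rescale
    return w2

w = np.random.randn(8, 4, 3, 3)
print(widen_conv_weight(w).shape)  # (16, 8, 3, 3)
```

With duplicated input channels and the 1/factor rescale, each widened output channel initially computes the same response as its source channel, so the x1.0 model starts from the x0.5 function.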
- Like what BERT did for pre-training: do unsupervised pre-training on masked-out images to get a better feature extractor than one trained on ImageNet. 
- Collect more than 1 million images. 
- Randomly mask out regions and add noise to the rest of the pixels. 
- Train an encoder-decoder hole-filling network, e.g. a GAN for in-painting. 
- Use the encoder part of the GAN as the pure backbone. 
- Then fine-tune this backbone on any other CV task: classification, detection, segmentation, etc. 
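The corruption step above might be sketched like this; patch size, mask ratio, and noise level are all assumed hyper-parameters:

```python
import numpy as np

def mask_and_noise(img, patch=4, mask_ratio=0.5, noise_std=0.1, rng=None):
    """Build a (corrupted, target) pair for masked-image pre-training.

    A fraction of `patch`-sized squares is zeroed out and light Gaussian
    noise is added to the remaining pixels; the clean image is the
    reconstruction target for the encoder-decoder network.
    """
    rng = rng or np.random.default_rng(0)
    h, w = img.shape[:2]
    corrupted = img + rng.normal(0.0, noise_std, img.shape)
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            if rng.random() < mask_ratio:
                corrupted[y:y + patch, x:x + patch] = 0.0
    return corrupted, img
```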
 
- Samsung's one-shot deepfake. 
- A Beijing university student used DDPG to generate paintings stroke by stroke. 
- Distillation from [ high-performance models' Grad-CAM results ] to [ the supervised spatial attention maps of efficient architectures ]. 
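A minimal sketch of the distillation loss, assuming the teacher Grad-CAM and the student attention map are spatial heatmaps of the same size; min-max normalization before the MSE is one plausible choice, not a fixed recipe:

```python
import numpy as np

def attention_distill_loss(teacher_cam, student_attn, eps=1e-8):
    """MSE between a normalized teacher Grad-CAM map and a normalized
    student attention map, both of shape (H, W)."""
    def norm(m):
        m = m - m.min()
        return m / (m.max() + eps)
    t, s = norm(teacher_cam), norm(student_attn)
    return float(np.mean((t - s) ** 2))
```

This term would be added to the student's task loss, so the efficient model is pushed to attend where the large model's Grad-CAM fires.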
Applications
- Use a crawler to get music lyrics and apply a Chinese (NetEase) / English (Spotify) BERT on them. Do clustering and music-genre recommendation. 
- Could be personalized: input your favorite list, cluster it, and recommend by nearest distance to the cluster centers. 
 
- Could generate offline: create content categories offline [life, love, depression, happiness, etc.] 
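The personalized flow above could be sketched with a tiny k-means over precomputed lyric embeddings; the arrays here stand in for pooled lyrics-BERT outputs, and `k`, `iters`, and the Euclidean metric are all assumptions:

```python
import numpy as np

def recommend(favorite_embs, catalog_embs, k=2, iters=10, top=3, rng=None):
    """Cluster a user's favorite lyric embeddings with plain Lloyd
    k-means, then recommend the catalog songs whose embeddings are
    nearest to any cluster center."""
    rng = rng or np.random.default_rng(0)
    centers = favorite_embs[rng.choice(len(favorite_embs), k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(favorite_embs[:, None] - centers[None], axis=-1)
        labels = d.argmin(1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = favorite_embs[labels == j].mean(0)
    # score catalog items by distance to their closest cluster center
    dist = np.linalg.norm(catalog_embs[:, None] - centers[None], axis=-1).min(1)
    return np.argsort(dist)[:top]
```

The offline variant would simply replace the user's clusters with the precomputed category centers.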
 
 
- Use BERT to do paper content embedding. Then compute the similarity between different papers and check it against the citation relationships. 
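A sketch of the similarity step, assuming each paper has already been pooled into a single embedding vector (the pooling choice is left open):

```python
import numpy as np

def cosine_sim_matrix(embs, eps=1e-8):
    """Pairwise cosine similarity between paper embeddings.

    `embs` is (n_papers, dim); high off-diagonal entries would then be
    compared against the actual citation links between the papers.
    """
    normed = embs / (np.linalg.norm(embs, axis=1, keepdims=True) + eps)
    return normed @ normed.T
```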
- Use BERT to do paper summarization. The abstract is the summary and the paper itself can be treated as the full text. 
