Weakly Supervised Vision and Language Representation Learning in Sign Language Understanding