Is the Vision Encoder worth it? Does using it make a big difference?