streamlit
pandas
torch
transformers
stanza
spacy
python-docx
langdetect
openpyxl
xlsxwriter
lxml[html_clean]
newspaper3k==0.2.8
google-cloud-aiplatform>=1.66.0