
feat: cache huggingface gpt2 tokenizer files (#1138)

takatost · 1 year ago
commit 877da82b06
1 file changed, 3 insertions(+), 1 deletion(-)

api/Dockerfile  +3 -1

@@ -26,7 +26,7 @@ EXPOSE 5001
 
 WORKDIR /app/api
 
-RUN apt-get update \ 
+RUN apt-get update \
     && apt-get install -y --no-install-recommends bash curl wget vim nodejs \
     && apt-get autoremove \
     && rm -rf /var/lib/apt/lists/*
@@ -34,6 +34,8 @@ RUN apt-get update \
 COPY --from=base /pkg /usr/local
 COPY . /app/api/
 
+RUN python -c "from transformers import GPT2TokenizerFast; GPT2TokenizerFast.from_pretrained('gpt2')"
+
 COPY docker/entrypoint.sh /entrypoint.sh
 RUN chmod +x /entrypoint.sh
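
The added RUN step downloads the GPT-2 tokenizer files during the image build, so they end up in the Hugging Face cache baked into the image instead of being fetched over the network on first use at runtime. As a minimal sketch of what that enables (the local_files_only flag is standard transformers behaviour and not part of this commit), the tokenizer can then be loaded without network access inside the container:

# Sketch: confirm the tokenizer cached at build time loads locally.
# local_files_only=True makes transformers error out instead of
# downloading if the cached files were missing.
from transformers import GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2", local_files_only=True)
print(tokenizer.encode("hello world"))  # prints GPT-2 BPE token ids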