merge frontier branch (#1620)

* Zhipu SDK update: adapt to the latest Zhipu SDK, support GLM4v (#1502)
* Adapt Google Gemini: extract files from the user input
* Adapt to the latest Zhipu SDK, support glm-4v
* requirements.txt fix
* pending history check

---------

Co-authored-by: binary-husky <qingxu.fu@outlook.com>

* Update "生成多种Mermaid图表" plugin: Separate out the file reading function (#1520)
* Update crazy_functional.py with new functionality to deal with PDF
* Update crazy_functional.py and Mermaid.py for plugin_kwargs
* Update crazy_functional.py with new chart type: mind map
* Update SELECT_PROMPT and i_say_show_user messages
* Update ArgsReminder message in get_crazy_functions() function
* Update with read md file and update PROMPTS
* Return the PROMPTS as the test found that the initial version worked best
* Update Mermaid chart generation function
* version 3.71
* Resolve issue #1510
* Remove unnecessary text from sys_prompt in 解析历史输入 function
* Remove sys_prompt message in 解析历史输入 function
* Update bridge_all.py: supports gpt-4-turbo-preview (#1517)
* Update bridge_all.py: supports gpt-4-turbo-preview
  supports gpt-4-turbo-preview
* Update bridge_all.py

---------

Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com>

* Update config.py: supports gpt-4-turbo-preview (#1516)
* Update config.py: supports gpt-4-turbo-preview
  supports gpt-4-turbo-preview
* Update config.py

---------

Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com>

* Refactor 解析历史输入 function to handle file input
* Update Mermaid chart generation functionality
* rename files and functions

---------

Co-authored-by: binary-husky <qingxu.fu@outlook.com>
Co-authored-by: hongyi-zhao <hongyi.zhao@gmail.com>
Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com>

* Integrate Mathpix OCR (#1468)
* Update Latex输出PDF结果.py: use Mathpix to translate a PDF into Chinese and recompile the PDF
* Update config.py: add mathpix appid & appkey
* Add 'PDF翻译中文并重新编译PDF' feature to plugins.

---------

Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com>

* fix zhipuai
* check picture
* remove glm-4 due to bug
* Modify config
* Check MATHPIX_APPID
* Remove unnecessary code and update function_plugins dictionary
* capture non-standard token overflow
* bug fix #1524
* change mermaid style
* Support Mermaid zoom in, zoom out and reset, with mouse scrolling and dragging (#1530)
* Support Mermaid zoom in, zoom out and reset, with mouse scrolling and dragging
* Fine-tuning did not work out; stage the changes for now
* update

---------

Co-authored-by: binary-husky <qingxu.fu@outlook.com>
Co-authored-by: binary-husky <96192199+binary-husky@users.noreply.github.com>

* ver 3.72
* change live2d
* save the status of `clear btn` in cookie
* Persist frontend selections
* js ui bug fix
* reset btn bug fix
* update live2d tips
* fix missing get_token_num method
* fix live2d toggle switch
* fix persistent custom btn with cookie
* fix zhipuai feedback with core functionality
* Refactor button update and clean up functions
* trailing space removal
* Fix missing MATHPIX_APPID and MATHPIX_APPKEY configuration
* Prompt fix, mind-map prompt optimization (#1537)
* Adapt Google Gemini: extract files from the user input
* Mind-map prompt optimization
* Fix missing MATHPIX_APPID and MATHPIX_APPKEY configuration

---------

Co-authored-by: binary-husky <qingxu.fu@outlook.com>

* Optimize the "PDF翻译中文并重新编译PDF" plugin (#1602)
* Add gemini_endpoint to API_URL_REDIRECT (#1560)
* Add gemini_endpoint to API_URL_REDIRECT
* Update gemini-pro and gemini-pro-vision model_info endpoints
* Update to support new claude models (#1606)
* Add anthropic library and update claude models
* Update bridge_claude.py: add support for image input and fix some bugs
* Add the Claude_3_Models variable to limit the number of images
* Refactor code to improve readability and maintainability
* minor claude bug fix
* more flexible one-api support
* reformat config
* fix one-api new access bug
* dummy
* compat non-standard api
* version 3.73

---------

Co-authored-by: XIao <46100050+Kilig947@users.noreply.github.com>
Co-authored-by: Menghuan1918 <menghuan2003@outlook.com>
Co-authored-by: hongyi-zhao <hongyi.zhao@gmail.com>
Co-authored-by: Hao Ma <893017927@qq.com>
Co-authored-by: zeyuan huang <599012428@qq.com>
2024-03-11 17:26:09 +08:00
parent cd18663800
commit c3140ce344
85 changed files with 866 additions and 642 deletions
--- a/crazy_functions/latex_fns/latex_actions.py
+++ b/crazy_functions/latex_fns/latex_actions.py
@@ -90,16 +90,16 @@ class LatexPaperSplit():
            "版权归原文作者所有。翻译内容可靠性无保障，请仔细鉴别并以原文为准。" + \
            "项目Github地址 \\url{https://github.com/binary-husky/gpt_academic/}。"
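        # Please do not delete or modify this warning unless you are the original author of the paper (if you are, you are welcome to contact the developers via the QQ contact listed in the README)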
-        self.msg_declare = "为了防止大语言模型的意外谬误产生扩散影响，禁止移除或修改此警告。}}\\\\" 
+        self.msg_declare = "为了防止大语言模型的意外谬误产生扩散影响，禁止移除或修改此警告。}}\\\\"
        self.title = "unknown"
        self.abstract = "unknown"

    def read_title_and_abstract(self, txt):
        try:
            title, abstract = find_title_and_abs(txt)
-            if title is not None: 
+            if title is not None:
                self.title = title.replace('\n', ' ').replace('\\\\', ' ').replace('  ', '').replace('  ', '')
-            if abstract is not None: 
+            if abstract is not None:
                self.abstract = abstract.replace('\n', ' ').replace('\\\\', ' ').replace('  ', '').replace('  ', '')
        except:
            pass
@@ -111,7 +111,7 @@ class LatexPaperSplit():
        result_string = ""
        node_cnt = 0
        line_cnt = 0
-        
+
        for node in self.nodes:
            if node.preserve:
                line_cnt += node.string.count('\n')
@@ -144,7 +144,7 @@ class LatexPaperSplit():
        return result_string


-    def split(self, txt, project_folder, opts): 
+    def split(self, txt, project_folder, opts):
        """
        break down latex file to a linked list,
        each node use a preserve flag to indicate whether it should
@@ -155,7 +155,7 @@ class LatexPaperSplit():
        manager = multiprocessing.Manager()
        return_dict = manager.dict()
        p = multiprocessing.Process(
-            target=split_subprocess, 
+            target=split_subprocess,
            args=(txt, project_folder, return_dict, opts))
        p.start()
        p.join()
@@ -217,13 +217,13 @@ def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin
    from ..crazy_utils import request_gpt_model_multi_threads_with_very_awesome_ui_and_high_efficiency
    from .latex_actions import LatexPaperFileGroup, LatexPaperSplit

-    #  <-------- locate the main tex file ----------> 
+    #  <-------- locate the main tex file ---------->
    maintex = find_main_tex_file(file_manifest, mode)
    chatbot.append((f"定位主Latex文件", f'[Local Message] 分析结果：该项目的Latex主文件是{maintex}, 如果分析错误, 请立即终止程序, 删除或修改歧义文件, 然后重试。主程序即将开始, 请稍候。'))
    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
    time.sleep(3)

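-    #  <-------- read the Latex files and merge the multi-file tex project into one giant tex file ----------> 
+    #  <-------- read the Latex files and merge the multi-file tex project into one giant tex file ---------->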
    main_tex_basename = os.path.basename(maintex)
    assert main_tex_basename.endswith('.tex')
    main_tex_basename_bare = main_tex_basename[:-4]
@@ -240,13 +240,13 @@ def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin
    with open(project_folder + '/merge.tex', 'w', encoding='utf-8', errors='replace') as f:
        f.write(merged_content)

-    #  <-------- finely split the latex file ----------> 
+    #  <-------- finely split the latex file ---------->
    chatbot.append((f"Latex文件融合完成", f'[Local Message] 正在精细切分latex文件，这需要一段时间计算，文档越长耗时越长，请耐心等待。'))
    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI
    lps = LatexPaperSplit()
    lps.read_title_and_abstract(merged_content)
    res = lps.split(merged_content, project_folder, opts) # time-consuming function
-    #  <-------- split overly long latex fragments ----------> 
+    #  <-------- split overly long latex fragments ---------->
    pfg = LatexPaperFileGroup()
    for index, r in enumerate(res):
        pfg.file_paths.append('segment-' + str(index))
@@ -255,17 +255,17 @@ def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin
    pfg.run_file_split(max_token_limit=1024)
    n_split = len(pfg.sp_file_contents)

-    #  <-------- switch the prompt as needed ----------> 
+    #  <-------- switch the prompt as needed ---------->
    inputs_array, sys_prompt_array = switch_prompt(pfg, mode)
    inputs_show_user_array = [f"{mode} {f}" for f in pfg.sp_file_tag]

    if os.path.exists(pj(project_folder,'temp.pkl')):

-        #  <-------- [debug only] if a debug cache file exists, skip the GPT request step ----------> 
+        #  <-------- [debug only] if a debug cache file exists, skip the GPT request step ---------->
        pfg = objload(file=pj(project_folder,'temp.pkl'))

    else:
-        #  <-------- multi-threaded gpt requests ----------> 
+        #  <-------- multi-threaded gpt requests ---------->
        history_array = [[""] for _ in range(n_split)]
        # LATEX_EXPERIMENTAL, = get_conf('LATEX_EXPERIMENTAL')
        # if LATEX_EXPERIMENTAL:
@@ -284,32 +284,32 @@ def Latex精细分解与转化(file_manifest, project_folder, llm_kwargs, plugin
            scroller_max_len = 40
        )

-        #  <-------- reassemble the text fragments into complete tex segments ----------> 
+        #  <-------- reassemble the text fragments into complete tex segments ---------->
        pfg.sp_file_result = []
        for i_say, gpt_say, orig_content in zip(gpt_response_collection[0::2], gpt_response_collection[1::2], pfg.sp_file_contents):
            pfg.sp_file_result.append(gpt_say)
        pfg.merge_result()

-        # <-------- temporary storage for debugging ----------> 
+        # <-------- temporary storage for debugging ---------->
        pfg.get_token_num = None
        objdump(pfg, file=pj(project_folder,'temp.pkl'))

    write_html(pfg.sp_file_contents, pfg.sp_file_result, chatbot=chatbot, project_folder=project_folder)

-    #  <-------- write out the files ----------> 
+    #  <-------- write out the files ---------->
    msg = f"当前大语言模型: {llm_kwargs['llm_model']}，当前语言模型温度设定: {llm_kwargs['temperature']}。"
    final_tex = lps.merge_result(pfg.file_result, mode, msg)
    objdump((lps, pfg.file_result, mode, msg), file=pj(project_folder,'merge_result.pkl'))

    with open(project_folder + f'/merge_{mode}.tex', 'w', encoding='utf-8', errors='replace') as f:
        if mode != 'translate_zh' or "binary" in final_tex: f.write(final_tex)
-        

-    #  <-------- collect the results, exit ----------> 
+
+    #  <-------- collect the results, exit ---------->
    chatbot.append((f"完成了吗？", 'GPT结果已输出, 即将编译PDF'))
    yield from update_ui(chatbot=chatbot, history=history) # refresh the UI

-    #  <-------- return ----------> 
+    #  <-------- return ---------->
    return project_folder + f'/merge_{mode}.tex'


@@ -362,7 +362,7 @@ def 编译Latex(chatbot, history, main_file_original, main_file_modified, work_f

        yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 编译转化后的PDF ...', chatbot, history)   # refresh the Gradio frontend
        ok = compile_latex_with_timeout(f'pdflatex -interaction=batchmode -file-line-error {main_file_modified}.tex', work_folder_modified)
-        
+
        if ok and os.path.exists(pj(work_folder_modified, f'{main_file_modified}.pdf')):
            # only continue with the steps below if the second step succeeded
            yield from update_ui_lastest_msg(f'尝试第 {n_fix}/{max_try} 次编译, 编译BibTex ...', chatbot, history)    # refresh the Gradio frontend
@@ -393,9 +393,9 @@ def 编译Latex(chatbot, history, main_file_original, main_file_modified, work_f
        original_pdf_success = os.path.exists(pj(work_folder_original, f'{main_file_original}.pdf'))
        modified_pdf_success = os.path.exists(pj(work_folder_modified, f'{main_file_modified}.pdf'))
        diff_pdf_success     = os.path.exists(pj(work_folder, f'merge_diff.pdf'))
-        results_ += f"原始PDF编译是否成功: {original_pdf_success};" 
-        results_ += f"转化PDF编译是否成功: {modified_pdf_success};" 
-        results_ += f"对比PDF编译是否成功: {diff_pdf_success};" 
+        results_ += f"原始PDF编译是否成功: {original_pdf_success};"
+        results_ += f"转化PDF编译是否成功: {modified_pdf_success};"
+        results_ += f"对比PDF编译是否成功: {diff_pdf_success};"
        yield from update_ui_lastest_msg(f'第{n_fix}编译结束:<br/>{results_}...', chatbot, history) # refresh the Gradio frontend

        if diff_pdf_success:
@@ -409,7 +409,7 @@ def 编译Latex(chatbot, history, main_file_original, main_file_modified, work_f
                shutil.copyfile(result_pdf, pj(work_folder, '..', 'translation', 'translate_zh.pdf'))
            promote_file_to_downloadzone(result_pdf, rename_file=None, chatbot=chatbot)  # promote file to web UI
            # concatenate the two PDFs
-            if original_pdf_success: 
+            if original_pdf_success:
                try:
                    from .latex_toolbox import merge_pdfs
                    concat_pdf = pj(work_folder_modified, f'comparison.pdf')
@@ -425,7 +425,7 @@ def 编译Latex(chatbot, history, main_file_original, main_file_modified, work_f
            if n_fix>=max_try: break
            n_fix += 1
            can_retry, main_file_modified, buggy_lines = remove_buggy_lines(
-                file_path=pj(work_folder_modified, f'{main_file_modified}.tex'), 
+                file_path=pj(work_folder_modified, f'{main_file_modified}.tex'),
                log_path=pj(work_folder_modified, f'{main_file_modified}.log'),
                tex_name=f'{main_file_modified}.tex',
                tex_name_pure=f'{main_file_modified}',
@@ -445,14 +445,14 @@ def write_html(sp_file_contents, sp_file_result, chatbot, project_folder):
        import shutil
        from crazy_functions.pdf_fns.report_gen_html import construct_html
        from toolbox import gen_time_str
-        ch = construct_html() 
+        ch = construct_html()
        orig = ""
        trans = ""
        final = []
-        for c,r in zip(sp_file_contents, sp_file_result): 
+        for c,r in zip(sp_file_contents, sp_file_result):
            final.append(c)
            final.append(r)
-        for i, k in enumerate(final): 
+        for i, k in enumerate(final):
            if i%2==0:
                orig = k
            if i%2==1: