class DeepSeekAuditChecker:
    def __init__(self, api_key: str, base_url: str = "https://api.deepseek.com/v1/chat/completions"):
        """
        Initialize the DeepSeek API caller.

        Args:
            api_key: DeepSeek API key
            base_url: API base URL
        """
        self.api_key = api_key
        self.base_url = base_url
        self.headers = {
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}"
        }
        self.lock = threading.Lock()  #

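    def process_csv_multithreaded(self, csv_file_path: str, output_file_path: str, num_threads: int = 4) -> None:
        """
        Process a CSV file with multiple threads.

        Args:
            csv_file_path: path of the input CSV file
            output_file_path: path of the output CSV file
            num_threads: number of worker threads
        """
        # Read the CSV file
        df = pd.read_csv(csv_file_path)
        self.total_count = len(df)
        self.processed_count = 0

        # Prepare the data as (index, original_office, matched_uniformname) tuples
        data = [(i, row['original_office'], row['matched_uniformname']) for i, row in df.iterrows()]

        # Create the thread pool; executor.map returns results in input order
        with ThreadPoolExecutor(max_workers=num_threads) as executor:
            results = list(executor.map(self.process_single_row, data))

        # Write the results back into the DataFrame
        for index, result in results:
            df.at[index, 'is_same'] = result

        # Save the results
        df.to_csv(output_file_path, index=False)
        print(f"Processing finished, results saved to: {output_file_path}")


# Usage example
if __name__ == "__main__":
    # These imports run at module level, so the class above can see them;
    # they would normally sit at the top of the file.
    import threading
    import time
    import requests
    import pandas as pd
    from concurrent.futures import ThreadPoolExecutor

    checker = DeepSeekAuditChecker(api_key="your_api_key_here")
    checker.process_csv_multithreaded(
        csv_file_path="input.csv",
        output_file_path="output.csv",
        num_threads=4
    )

Code notes:
1. The class is initialized with a DeepSeek API key.
2. The call_deepseek_api method builds the prompt and calls the API.
3. The process_single_row method handles a single row of data.
4. The process_csv_multithreaded method processes the whole CSV file with multiple threads.
5. A new is_same column is added to the results: 1 means the two names refer to the same firm, 0 means they do not.

Things to note:
1. The requests and pandas libraries must be installed.
2. API calls are rate-limited, so it is advisable to add an appropriate delay between calls.
3. Printing from multiple threads uses a lock to stay thread-safe.
4. Empty values are skipped automatically and marked as 0.

[2025-06-01 00:22:24 | AI Q&A | answered for 471 points]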
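The notes on rate limiting, lock-protected printing, and empty values can be illustrated with a minimal sketch. The body of process_single_row is not shown in the answer above, so everything here is an assumption: RowProcessor, is_missing, fake_compare, and the delay_seconds parameter are hypothetical names, and fake_compare stands in for the real API call so the example runs offline.

```python
import math
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple

Row = Tuple[int, object, object]  # (index, original_office, matched_uniformname)


def is_missing(value: object) -> bool:
    """Treat None, NaN, and blank strings as empty values."""
    if value is None:
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    return isinstance(value, str) and not value.strip()


class RowProcessor:
    """Hypothetical stand-in for the row-handling part of the checker."""

    def __init__(self, compare: Callable[[str, str], int], delay_seconds: float = 0.0):
        self.compare = compare              # assumed to wrap the real API call
        self.delay_seconds = delay_seconds  # pause before each call, per worker
        self.lock = threading.Lock()
        self.processed_count = 0
        self.total_count = 0

    def process_single_row(self, item: Row) -> Tuple[int, int]:
        index, original, matched = item
        if is_missing(original) or is_missing(matched):
            result = 0  # empty values are skipped and marked as 0
        else:
            time.sleep(self.delay_seconds)  # crude throttle, not a global rate limit
            result = self.compare(str(original), str(matched))
        # The lock keeps the counter update and the progress line together.
        with self.lock:
            self.processed_count += 1
            print(f"[{self.processed_count}/{self.total_count}] row {index} -> {result}")
        return index, result

    def run(self, rows: List[Row], num_threads: int = 4) -> List[Tuple[int, int]]:
        self.total_count = len(rows)
        self.processed_count = 0
        with ThreadPoolExecutor(max_workers=num_threads) as executor:
            return list(executor.map(self.process_single_row, rows))


def fake_compare(a: str, b: str) -> int:
    """Placeholder comparison: case-insensitive equality instead of an API call."""
    return int(a.strip().lower() == b.strip().lower())


if __name__ == "__main__":
    rows = [(0, "KPMG", "kpmg"), (1, None, "Deloitte"), (2, "EY", "PwC")]
    print(RowProcessor(fake_compare, delay_seconds=0.1).run(rows, num_threads=2))
```

Because each worker sleeps independently, the effective request rate still grows with num_threads; a shared limiter would be needed if the API enforces a strict global quota.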
